PESTICIDAL AGENTS

The present invention relates to materials, agents and
compositions having pesticidal activity which derive from
bacteria, and more particularly from Xenorhabdus species.
The invention further relates to organisms and methods
employing such compounds and compositions.

There is an ongoing requirement for materials, agents,
compositions and organisms having pesticidal activity,
for instance for use in crop protection or insect-mediated
disease control. Novel materials are required to overcome
the problem of resistance to existing pesticides. Ideally
such materials are cheap to produce, stable, have a high
toxicity (either when used alone or in combination) and are
effective when taken orally by the pest target. Thus any
invention which provided materials, agents, compositions or
organisms in which any of these properties was enhanced
would represent a step forward in the art.

Xenorhabdus spp. in nature are frequently symbiotically
associated with a nematode host, and it is known that
this association may be used to control pest activity.
For instance, it is known that certain Xenorhabdus spp.
alone are capable of killing an insect host when injected
into the host's hemocoel.

In addition, one extracellular insecticidal toxin from
Photorhabdus luminescens has been isolated (this species
was recently removed from the genus Xenorhabdus, and is
closely related to the species therein). This toxin is
not effective when ingested, but is highly toxic when
injected into certain insect larvae (see Parasites and
Pathogens of Insects Vol. 2, Eds. Beckage, N. E. et
al., Academic Press 1993).

Also known are certain low-molecular weight heterocyclic
compounds from P. luminescens and X. nematophilus which
have antibiotic properties when applied intravenously or
topically (see Rhodes, S.H. et al., PCT WO 84/01775).

Unfortunately none of these prior art materials have the
ideal pesticide characteristics discussed above, and in
particular, they do not have toxic activity when
administered orally.

The present invention provides pesticidal agents and
compositions from Xenorhabdus species, organisms which
produce such compounds and compositions, and methods
which employ these agents, compositions and organisms,
that alleviate some of the problems with the prior art.

According to one aspect of the present invention there is
disclosed a method of killing or controlling insect pests
comprising administering cells from Xenorhabdus species
or pesticidal materials derived or obtainable therefrom,
orally to the pests.

A PCT application of CSIRO published as WO 95/00647
discloses an apparently toxic protein from Xenorhabdus
nematophilus; however, no details of the protein's
toxicity are given, and certainly there is no disclosure
of its use as an oral insecticide.

Thus the invention provides an insecticidal composition
adapted for oral administration to an insect, which
composition comprises a pesticidal material obtainable
from a Xenorhabdus species, or a pesticidal fragment
thereof, or a pesticidal variant or derivative of either
of these.

The composition may in fact comprise cells of Xenorhabdus
or alternatively supernatant taken from cultures of cells
of Xenorhabdus species. However, the composition
preferably comprises toxins isolable from Xenorhabdus as
illustrated hereinafter. Toxic activity has been
associated with material encoded by the nucleotide
sequence of Figure 2. Thus, the composition suitably
comprises a pesticidal material which is encoded by all
or part of the nucleotide sequence of Figure 2.
Pesticidal fragments as well as variants or derivatives
of such toxins may also be employed.

The sequence of Figure 2 is of the order of 40 kb in
length. It is believed that this sequence may encode more
than one protein, each of which may regulate or be
insecticidal either alone or when presented together. It
is a matter of routine to determine which parts are
necessary or sufficient for insecticidal activity.

As used herein the term "variant" refers to toxins which
have a modified amino acid sequence but which share similar
activity. Certain amino acids may be replaced with
different amino acids without altering the nature of the
activity in a significant way. The replacement may be
by way of "conservative substitution", where an amino
acid is replaced with an amino acid of broadly similar
properties, or there may be some non-conservative
substitutions. In general however, the variants will be
at least 60% homologous to the native toxin, suitably at
least 70% homologous and more preferably at least 90%
homologous.
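The homology thresholds above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes that "homologous" is measured as simple percentage identity over two sequences that have already been aligned to equal length, which the specification does not state. The function names and the ten-residue peptides are hypothetical and are not taken from Figure 2.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that carry the same residue.

    Both sequences must already be aligned to the same length;
    gap characters ('-') are counted as mismatches.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal length and non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def homology_band(identity: float) -> str:
    """Map a percentage onto the 60/70/90% bands named in the text."""
    if identity >= 90.0:
        return "at least 90%"
    if identity >= 70.0:
        return "at least 70%"
    if identity >= 60.0:
        return "at least 60%"
    return "below 60%"


# Hypothetical peptides for illustration only (not from Figure 2).
native = "MKTLLVAGSA"
variant = "MKTLIVAGSA"  # one conservative substitution, Leu -> Ile

identity = percent_identity(native, variant)
print(identity, homology_band(identity))
```

In practice an alignment program would be needed to place gaps before any such figure is computed; the sketch deliberately leaves that step out.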

The term "derivative" relates to toxins which have been
modified for example by chemical or biological methods.

These toxins are novel, and they and the nucleic acids
which encode them form a further aspect of the invention.

A preferred Xenorhabdus species is the bacterium
X. nematophilus. Particular strains of X. nematophilus
which are useful in the context of the invention are the
ATCC 19061 strain, available from the National Collection
of Industrial and Marine Bacteria, Aberdeen, Scotland
(NCIMB). In addition, suitable strains include two novel
strains of Xenorhabdus which were deposited at the NCIMB
on 10 July 1997 and were designated with repository
numbers NCIMB 40886 and NCIMB 40887. These latter
strains form a further aspect of the invention.

All strains have common characteristics as set out in the
following Table 1.

Table 1

                                  Strains
Characteristics                   ATCC 19061    NCIMB 40887   NCIMB 40886
Gram stain                        negative      negative      negative
Shape/size                        rods up to    rods up to    rods up to
                                  4 µm long     4 µm long     4 µm long
Motile                            Yes           Yes           Yes
Bioluminescent                    No            No            No
Colour on NBTA*                   blue          blue          blue
Insecticidal on ingestion
  by insects                      yes           yes           yes
Production of antibiotics         yes           yes           yes
Resistant to ampicillin
  (50 µg/ml)                      yes           yes           yes
Colony morphology/colour          circular      circular      circular
                                  convex        convex        convex
                                  cream         cream         cream

*NBTA (Oxoid nutrient agar containing 0.0025%
bromothymol blue and 0.004% tetrazolium chloride)

Preferably the pest target is an insect, and more
preferably it is of the order Lepidoptera, particularly
Pieris brassicae, Pieris rapae or Plutella xylostella, or
the order Diptera, particularly Culex quinquefasciatus.

In a preferred embodiment of the invention, cells from
Xenorhabdus species or agents derived therefrom are used
in conjunction with Bacillus thuringiensis as an oral
pesticide.

In further embodiments, rather than using Bacillus
thuringiensis itself, pesticidal materials obtainable
from B. thuringiensis (e.g. delta endotoxins or other
isolates) are used in conjunction with Xenorhabdus
species.

The term "obtainable from" is intended to embrace not
only materials which have been isolated directly from the
bacterium in question, but also those which have been
subsequently cloned into and produced by other organisms.

Thus the unexpected discovery that bacteria of the genus
Xenorhabdus (and materials derived therefrom) have
pesticidal activity when ingested, and that such bacteria
and materials can be used advantageously in conjunction
with B. thuringiensis (and toxins or materials derived
therefrom), forms the basis of a further aspect of the
present invention. The pesticidal activity of
B. thuringiensis isolates alone has been well documented.
However, synergistic pesticidal activity between such
isolates and bacteria of the Xenorhabdus species (or
materials derived therefrom) has not previously been
demonstrated.

In still further embodiments of the invention, culture
supernatant taken from cultures of Xenorhabdus species,
particularly X. nematophilus, is used in place of cells
from Xenorhabdus species in the methods above.
All of these methods can be employed, inter alia, in pest
control.

The invention also makes available pesticidal
compositions comprising cells from Xenorhabdus species,
preferably X. nematophilus, in combination with B.
thuringiensis. As with the methods above, a pesticidal
toxin from B. thuringiensis (preferably a delta endotoxin)
may be used as an alternative to B. thuringiensis in the
compositions of the present invention.

Likewise, culture supernatant taken from cultures of
Xenorhabdus species, preferably X. nematophilus, may be
used in place of cells from Xenorhabdus species.

Such compositions can be employed, inter alia, for crop
protection, e.g. by spraying crops, or for livestock
protection. In addition, compositions of the invention
may be used in vector control.

The invention further encompasses novel pesticidal agents
which can be isolated from Xenorhabdus spp. Techniques
for isolating such agents would be understood by the
skilled person.

In particular, such techniques include the separation and
identification of toxin proteins either at the protein
level or at the DNA level.

The applicants have cloned and partially sequenced a
region of DNA from Xenorhabdus NCIMB 40887 which region
codes for insecticidal activity and this is shown as
Figure 2 (SEQ ID NO. 1) hereinafter. Thus in a preferred
embodiment the invention also provides a toxin which is
encoded by DNA of SEQ ID NO. 1 or a variant or fragment
thereof.

The invention also provides a recombinant DNA which
encodes such a toxin. The recombinant DNA of the
invention may comprise the sequence of Figure 2 or a
variant or fragment thereof. Other DNA sequences may
encode similar proteins as a result of the degeneracy of
the genetic code. All such sequences are encompassed by
the invention.
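The point about degeneracy can be made concrete with a small translation exercise. The sketch below is illustrative only: the two DNA strings are hypothetical and are not drawn from Figure 2, and the codon table is a deliberately partial extract of the standard genetic code covering only the codons used here.

```python
# Partial extract of the standard genetic code (only the codons used below).
CODON_TABLE = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "TCT": "S", "AGC": "S",
    "TAA": "*", "TGA": "*",  # stop codons
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        residue = CODON_TABLE[dna[i:i + 3]]  # KeyError for codons outside the extract
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Two hypothetical sequences that differ at several nucleotide positions.
seq_1 = "ATGAAACTGTCTTAA"
seq_2 = "ATGAAGTTAAGCTGA"

differences = sum(1 for a, b in zip(seq_1, seq_2) if a != b)
print(translate(seq_1), translate(seq_2), differences)
```

Both sequences yield the same peptide although the nucleotide strings differ, which is the situation the preceding paragraph is intended to cover.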

The sequence provided herein is sufficient to allow
probes to be produced which can be used to identify and
subsequently to extract DNA of toxin genes. This DNA may
then be cloned into vectors and host cells as is
understood in the art.

DNA which comprises or hybridises with the sequence of
Figure 2 under stringent conditions forms a further
aspect of the invention.

The expression "hybridises with" means that the
nucleotide sequence will anneal to all or part of the
sequence of Figure 2 under stringent hybridisation
conditions, for example those illustrated in "Molecular
Cloning, A Laboratory Manual" by Sambrook, Fritsch and
Maniatis, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y.

The length of the sequence used in any particular
analytical technique will depend upon the nature of the
technique, the degree of complementarity of the sequence,
the nature of the sequence and particularly the GC
content of the probe or primer and the particular
hybridisation conditions employed. Under high
stringency, only sequences which are completely
complementary will bind but under low stringency
conditions, sequences which are 60% homologous to the
target sequence, more suitably 80% homologous, will bind.
Both high and low stringency conditions are encompassed
by the term "stringent conditions" used herein.
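The dependence of hybridisation behaviour on GC content can be illustrated numerically. The sketch below computes GC content and a rough melting temperature using the Wallace rule (2°C per A/T and 4°C per G/C), a rule of thumb generally applied only to short oligonucleotides of roughly 14 to 20 bases. The rule is an assumption introduced here for illustration and is not part of the specification; the 18-mer probe is hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); intended for short oligonucleotides only.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 18-mer probe, not taken from Figure 2.
probe = "ATGCGCGCTTAAGGCCAT"
print(round(gc_content(probe), 1), wallace_tm(probe))
```

A probe richer in G and C gives a higher estimate, so it would tolerate more stringent washing than an AT-rich probe of the same length.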
Suitable fragments of the DNA of Figure 2, i.e. those
which encode pesticidal agents, may be identified using
standard techniques. For example, transposon
mutagenesis techniques may be used, for example as
described by H.S. Seifert et al., Proc. Natl. Acad. Sci.
USA (1986) 83, 735-739. Vectors such as the cosmid
cHRIM1 can be mutated using a variety of transposons and
then screened for loss of insecticidal activity. In this
way regions of DNA encoding proteins responsible for
toxic activity can be identified.

For example, the mini-transposon mTn3(HIS3) can be
introduced into a toxic Xenorhabdus clone such as cHRIM1,
hereinafter referred to as "clone 1", by electroporating
cHRIM1 DNA into E. coli RDP146 (pLB101) and mating this
strain with E. coli RDP146 (pOX38), followed by E. coli
NS2114Sm. The final strain will contain cHRIM1 DNA with a
single insertion of the transposon mTn3(HIS3). These
colonies can be cultured and tested for insecticidal
activity as described in Example 8 hereinafter.
Restriction mapping or DNA sequencing can be used to
identify the insertion point of mTn3(HIS3) and hence the
regions of DNA involved in toxicity. Similar approaches
can be used with other transposons such as Tn5 and mTn5.
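The mapping logic described above, in which insertion points are correlated with loss of activity, can be sketched in a few lines. This is an illustrative outline under stated assumptions: the function name, the insertion coordinates and the bioassay outcomes are all hypothetical, and real data would also need to account for polar effects of an insertion on downstream genes.

```python
def candidate_regions(insertions):
    """Group neighbouring activity-abolishing insertions into candidate regions.

    insertions: iterable of (position_bp, retains_activity) pairs, one per
    mutant clone. Returns a list of (first_bp, last_bp) spans covering runs of
    knock-out insertions that are not interrupted by an insertion which left
    insecticidal activity intact.
    """
    regions = []
    current = None
    for position, retains_activity in sorted(insertions):
        if not retains_activity:
            # Extend the open run, or start a new one at this position.
            current = (current[0], position) if current else (position, position)
        elif current:
            regions.append(current)
            current = None
    if current:
        regions.append(current)
    return regions


# Hypothetical insertion sites (bp from one end of the insert) and outcomes.
mutants = [
    (1200, True),
    (4800, False),
    (5600, False),
    (9100, True),
    (15000, False),
    (22000, True),
]
print(candidate_regions(mutants))
```

Each span returned is bounded on either side by an insertion that did not affect toxicity, so the true extent of the functional region lies somewhere between those flanking sites.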

Site directed mutagenesis of cHRIM1 as outlined in
"Molecular Cloning, A Laboratory Manual" by Maniatis,
Fritsch and Sambrook (1982), Cold Spring Harbor, can also
be used to test the importance of specific regions of DNA
for toxic activity.

Alternatively, subcloning techniques can be used to
identify regions of the cloned DNA which code for
insecticidal activity. In this method, specific smaller
fragments of the DNA are subcloned and the activity
determined. To do this, cosmid DNA can be cut with a
suitable restriction enzyme and ligated into a compatible
restriction site on a plasmid vector, such as pUC19.
The ligation mix can be transformed into E. coli and
transformed clones selected using a selection marker such
as antibiotic resistance, which is coded for on the
plasmid vector. Details of these techniques are
described for example in Maniatis et al., supra (see
p390-391), and Methods in Molecular Biology, by L.G.
Davies, M.D. Dibner and J.F. Battey, Elsevier (see
p222-224).

Individual colonies containing specific cloned fragments
can be cultured and tested for activity as described in
Example 8 hereinafter. Subclones with insecticidal
activity can be further truncated using the same
methodology to further identify regions of the DNA coding
for activity.
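The first step of the subcloning approach, cutting the cloned DNA at restriction sites to obtain smaller fragments, can be simulated as follows. This is a minimal sketch only: it treats the DNA as a linear molecule, whereas a cosmid is circular, and it uses EcoRI (recognition site GAATTC, cut after the first G) purely as a familiar example, since the specification does not name an enzyme. The input sequence is hypothetical.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC). Returns the fragments in
    order along the molecule.
    """
    seq = seq.upper()
    cut_points = []
    start = seq.find(site)
    while start != -1:
        cut_points.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cut_points + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical 25 bp sequence carrying two EcoRI sites.
insert = "ATGCGAATTCGGCTAGAATTCTTAA"
fragments = digest(insert)
print(fragments, [len(f) for f in fragments])
```

Each fragment would then be ligated into a vector and the resulting clones assayed individually, so that activity can be assigned to a particular fragment.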

The invention also discloses an isolated pesticidal agent
characterised in that the agent is obtainable from
cultures of X. nematophilus or variants thereof, has oral
pesticidal activity against Pieris brassicae, Pieris
rapae and Plutella xylostella, is substantially heat
stable to 55°C, is proteinaceous, acts synergistically
with B. thuringiensis cells as an oral pesticide and is
substantially resistant to proteolysis by trypsin and
proteinase K.

By "substantially heat stable to 55°C" is meant that the
agent retains some pesticidal activity when tested after
heating the agent in suspension to 55°C for 10 minutes,
and preferably retains at least 50% of the untreated
activity.

By "substantially resistant to proteolysis" is meant that
the agent retains some pesticidal activity when exposed
to proteases at 30°C for 2 hours, and preferably retains
at least 50% of the untreated activity.
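The "at least 50% of the untreated activity" criterion can be expressed as a simple ratio. The sketch below rests on an assumption that the specification does not make explicit: that potency is inversely proportional to LC50, so that retained activity can be estimated as the untreated LC50 divided by the treated LC50. The function names are hypothetical. The figures used are the whole-cell LC50 values against P. brassicae reported in Example 1 (untreated versus 55°C), which relate to cell suspensions rather than to the isolated agent.

```python
def retained_activity_percent(lc50_untreated: float, lc50_treated: float) -> float:
    """Estimate retained activity as LC50(untreated) / LC50(treated), in percent.

    Assumes potency is inversely proportional to LC50 (an assumption made
    for this illustration only).
    """
    if lc50_untreated <= 0 or lc50_treated <= 0:
        raise ValueError("LC50 values must be positive")
    return 100.0 * lc50_untreated / lc50_treated


def meets_preferred_criterion(retained_percent: float) -> bool:
    """True if at least 50% of the untreated activity is retained."""
    return retained_percent >= 50.0


# Whole-cell LC50 values (cells/g diet) from Example 1, untreated vs 55 C.
two_day = retained_activity_percent(5.9e5, 7.1e5)
four_day = retained_activity_percent(9.8e4, 1.4e5)
print(round(two_day, 1), round(four_day, 1))
print(meets_preferred_criterion(two_day), meets_preferred_criterion(four_day))
```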






WO 98/08388 



PCT/GB97/02284 




By "acts synergistically" is meant that the activity of
the combination of components is greater than one might
expect from the use of the components individually. For
example, when used in conjunction with B. thuringiensis
cells as an oral pesticide, the concentration of B.
thuringiensis cellular material necessary to give 50%
mortality in P. brassicae when used alone is reduced by
at least 80% when it is used in combination with the agent
at a concentration sufficient to give 25% mortality when
the agent is used alone.
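The 80% reduction criterion is straightforward arithmetic on two LC50 values. The following sketch applies it to two pairs of LC50 figures (µg Bt powder/g diet) reported in the Examples: 1.7 alone against 0.09 in combination with X. nematophilus cells for P. brassicae, and 7.2 alone against 0.63 in combination with supernatant for P. xylostella. The function names are hypothetical.

```python
def lc50_reduction_percent(lc50_alone: float, lc50_combined: float) -> float:
    """Percentage by which the LC50 falls when the second component is added."""
    if lc50_alone <= 0:
        raise ValueError("LC50 of the component alone must be positive")
    return 100.0 * (1.0 - lc50_combined / lc50_alone)


def meets_synergy_criterion(lc50_alone: float, lc50_combined: float,
                            threshold: float = 80.0) -> bool:
    """True if the reduction in LC50 is at least the stated threshold."""
    return lc50_reduction_percent(lc50_alone, lc50_combined) >= threshold


# LC50 pairs (Bt alone, Bt plus X. nematophilus material) from the Examples.
pairs = {
    "P. brassicae, cells": (1.7, 0.09),
    "P. xylostella, supernatant": (7.2, 0.63),
}
for label, (alone, combined) in pairs.items():
    print(label, round(lc50_reduction_percent(alone, combined), 1),
          meets_synergy_criterion(alone, combined))
```

On these figures the reductions are roughly 95% and 91%, both above the 80% threshold given in the definition.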

It has been found that the activity of the material is 
retained by 30 kDa cut-off filters but is only partly 
retained by 100 kDa filters. 

Preferably the agent is still further characterised in
that the pesticidal activity is lost through treatment at
25°C with sodium dodecyl sulphate (SDS, 0.1%, 60 mins)
and acetone (50%, 60 mins).

Clearly the characterising properties of the isolated
agent described above can be utilised to purify it from,
or enrich its concentration in, Xenorhabdus species cells
and culture medium supernatants. Methods of purifying
proteins from heterogeneous mixtures are well known in the
art (e.g. ammonium sulphate precipitation, proteolysis,
ultrafiltration with known molecular weight cut-off
filters, ion-exchange chromatography, gel filtration,
etc.). The oral pesticidal activity provides a
convenient method of assaying the level of agent after
each stage, or in each sample of eluent. Such
methodology does not require inventive endeavour by those
skilled in the art.

The invention further discloses oral pesticidal
compositions comprising one or more agents as described
above. Such compositions preferably further comprise
other pesticidal materials from non-Xenorhabdus species.
These other materials may be chosen such as to have
complementary properties to the agents described above,
or act synergistically with them.

Preferably the oral pesticidal composition comprises one
or more pesticidal agents as described above in
combination with B. thuringiensis (or with a toxin
derived therefrom, preferably endotoxin).

Recombinant DNA encoding said proteins also forms a
further aspect of the invention. The DNA may be
incorporated into an expression vector under the
influence of suitable control elements such as promoters,
enhancers, signal sequences etc. as is understood in the
art. These expression vectors form a further aspect of
the invention. They may be used to transform a host
organism so as to ensure that the organism produces the
toxin.

The invention further makes available a host organism
comprising a nucleotide sequence coding for a pesticidal
agent as described above.

Methods of cloning the sequence for a characterised
protein into a host organism are well known in the art.
For instance the protein may be purified and sequenced:
as activity is not required for sequencing, SDS gel
electrophoresis followed by blotting of the gel may be
used to purify the protein. The protein sequence can be
used to generate a nucleotide probe which can itself be
used to identify suitable genomic fragments from a
Xenorhabdus gene library. These fragments can then be
inserted via a suitable vector into a host organism which
can express the protein. The use of such general
methodology is routine and non-inventive to those skilled
in the art. Such techniques may be applied to the
production of Xenorhabdus toxins other than those encoded
by the sequence of Figure 2.

It may be desirable to manipulate (e.g. mutate) the agent
by altering its gene sequence (and hence protein
structure) such as to optimise its physical or
toxicological properties.

It may also be desirable for the host to be engineered or
selected such that it also expresses other proteinaceous
pesticidal materials (e.g. delta-endotoxin from B.
thuringiensis). Equally it may be desirable to generate
host organisms which express fusion proteins composed of
the active portion of the agent plus these other toxicity
enhancing materials.

A host may be selected for the purposes of generating
large quantities of pesticidal materials for purification,
e.g. by using B. thuringiensis transformed with the
agent-coding gene. Preferably however the host is a plant,
which would thereby gain improved pest-resistance.
Suitable plant vectors, e.g. the Ti plasmid from
Agrobacterium tumefaciens, are well known in the art.
Alternatively the host may be selected such as to be
directly pathogenic to pests, e.g. an insect baculovirus.

The teaching and scope of the present invention embraces
all of these host organisms plus the agents, mutated
agents or agent-fusion materials which they express.

Thus the invention makes available methods, compositions,
agents and organisms having industrially applicable
pesticidal activity, being particularly suited to
improved crop protection or insect-mediated disease
control.

The methods, compositions and agents of the present
invention will now be described, by way of illustration
only, through reference to the following non-limiting
examples and figures. Other embodiments falling within
the scope of the invention will occur to those skilled in
the art in the light of these.

FIGURES

Figure 1 shows the variation with time of the growth of
X. nematophilus ATCC 19061 and activity of cells and
supernatants against P. brassicae as described in Example
3.

Figure 2 shows the sequence of a major part of a cloned
toxin gene from Xenorhabdus.

Figure 3 shows a comparison of the restriction maps of
cloned toxin genes from two strains of Xenorhabdus
(clone 1 above and clone 3 below).

EXAMPLES

Example 1 - Use of X. nematophilus cells as an oral 
insecticide 

CELL GROWTH: A subculture of X. nematophilus (ATCC 19061,
Strain 9965 available from the National Collections of
Industrial and Marine Bacteria, Aberdeen, Scotland) was
used to inoculate 250 ml Erlenmeyer flasks each
containing 50 ml of Luria Broth containing 10 g tryptone,
5 g yeast extract and 5 g NaCl per litre. Cultures were
grown in the flasks at 27°C for 40 hrs on a rotary shaker.

PRODUCTION OF CELL SUSPENSION: Cultures were centrifuged
at 5000 x g for 10 mins. The supernatants were discarded
and the cell pellets washed once and resuspended in an
equal volume of phosphate buffered saline (8 g NaCl, 1.44 g
Na2HPO4 and 0.24 g of KH2PO4 per litre) at pH 7.4.
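For readers who prefer molar concentrations, the buffer recipe above can be converted with a few lines of arithmetic. The sketch assumes that the anhydrous salts were used, which the specification does not state; the molar masses are standard values and the variable names are illustrative.

```python
# Molar masses in g/mol for the anhydrous salts (assumed form).
MOLAR_MASS = {
    "NaCl": 58.44,
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
}

# Amounts per litre as given in the recipe (grams).
GRAMS_PER_LITRE = {
    "NaCl": 8.0,
    "Na2HPO4": 1.44,
    "KH2PO4": 0.24,
}

# Convert g/l to mmol/l: grams divided by molar mass, times 1000.
millimolar = {
    salt: 1000.0 * grams / MOLAR_MASS[salt]
    for salt, grams in GRAMS_PER_LITRE.items()
}

for salt, conc in millimolar.items():
    print(f"{salt}: {conc:.1f} mM")
```

If a hydrated form of the phosphate salt had been used instead, the corresponding molar concentration would be lower for the same weight.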
ACTIVITY OF CELL SUSPENSION TO INSECTS: The bioassays
were as follows:

P. brassicae: The larvae were allowed
to feed on an artificial agar-based diet (as described by
David and Gardiner (1965) London Nature, 207, 882-883)
into which a series of dilutions of cell suspension had
been incorporated. The bioassays were performed using a
series of 5 doses with a minimum of 25 larvae per dose.
Untreated and heat-treated (55°C for 10 minutes) cells
were tested. Mortality was recorded after 2 and 4 days
with the temperature maintained at 25°C.

                  LC50 (cells/g diet)
Treatment         2 days        4 days
Untreated         5.9 x 10^5    9.8 x 10^4
Treated 55°C      7.1 x 10^5    1.4 x 10^5
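The specification does not say how the LC50 values were derived from the five-dose mortality data; probit analysis is the usual choice for bioassays of this kind. As a simpler stand-in, the sketch below estimates an LC50 by linear interpolation on log10(dose) between the two doses that bracket 50% mortality. The function name and the dose-mortality figures are hypothetical and are not the data behind the table above.

```python
import math


def lc50_log_interpolation(doses, mortality_percent):
    """Estimate LC50 by interpolating on log10(dose) across the 50% point.

    doses: positive dose values (e.g. cells/g diet).
    mortality_percent: observed mortality (0-100) at each dose.
    Raises ValueError if 50% mortality is not bracketed by the data.
    """
    pairs = sorted(zip(doses, mortality_percent))
    for (d_lo, m_lo), (d_hi, m_hi) in zip(pairs, pairs[1:]):
        if m_lo <= 50.0 <= m_hi and m_hi > m_lo:
            fraction = (50.0 - m_lo) / (m_hi - m_lo)
            log_lc50 = math.log10(d_lo) + fraction * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_lc50
    raise ValueError("50% mortality is not bracketed by the tested doses")


# Hypothetical five-dose series (cells/g diet) and mortality after 2 days.
doses = [1e4, 1e5, 1e6, 1e7, 1e8]
mortality = [4.0, 20.0, 80.0, 100.0, 100.0]

estimate = lc50_log_interpolation(doses, mortality)
print(f"{estimate:.3g} cells/g diet")
```

Interpolation uses only two of the five doses and gives no confidence limits, which is why a fitted dose-response model would normally be preferred for reporting.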

Aedes aegypti: The larvae were exposed to a series of 5
different dilutions of cell suspension in deionised
water. The bioassays were performed using 2 doses per
dilution of 50 ml cell suspension in 9.5 cm plastic cups
with 25 second instar larvae per dose. Untreated and
heat-treated (55°C or 80°C for 10 minutes) cells were
tested. Mortality was recorded after 2 days with the
temperature maintained at 25°C.

                  LC50 (cells/ml)
Treatment         2 days
Untreated         5.1 x 10^6
Treated 55°C      7.4 x 10^6
Treated 80°C      > 10^8

Culex quinquefasciatus: The larvae were exposed to a
single concentration cell suspension containing 4 x 10^7
cells/ml. The bioassays were performed using two 50 ml
cell suspensions in 9.5 cm plastic cups with 25 second
instar larvae per cup. Untreated and heat-treated (55°C
or 80°C for 10 minutes) cells were tested. Mortality was
recorded after 2 days with the temperature maintained at
25°C.

                  % Mortality
Treatment         2 days
Untreated         100
Treated 55°C      100
Treated 80°C      0



Thus these results clearly show that cells from X.
nematophilus are effective as an oral insecticide against
a number of insect species (and are particularly potent
against P. brassicae). The insecticidal activity is not
dependent on cell viability (i.e. is largely unaffected by
heating to 55°C which reduces cell viability by >99.99%)
but is much reduced by heating to 80°C, which denatures
most proteins.

Example 2 - Use of X. nematophilus supernatant as an oral
insecticide

CELL GROWTH: Cultures were grown as in Example 1. 

PRODUCTION OF SUPERNATANT: Cultures were centrifuged
twice at 10000 x g for 10 mins. The cell pellets were
discarded.

ACTIVITY OF SUPERNATANT TO INSECTS: The bioassay was as
follows:

Activity against neonate P. brassicae and two day old
Pieris rapae and Plutella xylostella larvae was measured
as for P. brassicae in Example 1, but using a series of
untreated dilutions of supernatant in place of cell
suspensions and with mortality being recorded after 4 days
only.

                  LC50 (µl supernatant/g diet)
Insect species    4 days
P. brassicae      [illegible]
P. rapae          79
P. xylostella     135



In addition, size-reducing activity (62% reduction in 7 
days) against Mamestra brassicae was detected in larvae 
fed on an artificial diet containing X. nematophilus 
supernatant (results not shown). 

Thus these results clearly show that the supernatant from
X. nematophilus culture medium is effective as an oral
insecticide against a number of insect species, and is
particularly potent against P. brassicae.

The heating of supernatants to 55°C for 10 minutes caused
a partial loss of activity while 80°C caused complete
loss of activity. Activity was also completely lost by
treatment with SDS (0.1% w/v for 60 mins) and acetone (50%
v/v for 60 mins) but was unaffected by Triton X-100 (0.1%
60 mins), Nonidet P40 (0.1% 60 mins), NaCl (1 M for 60
mins) or cold storage at 4°C or -20°C for 2 weeks. All
of these properties are consistent with a proteinaceous
agent.

The general mode of action of X. nematophilus cells and
supernatants, i.e. reduction in larval size and death
within 2 days at high dosages, and other properties, e.g.
temperature resistance, appear to be similar, suggesting a
single agent or type of agent may be responsible for the
oral insecticidal activities of both cells and
supernatants.

Example 3 - Timescale for appearance of ingestible
insecticidal activity

CELL GROWTH: 1 ml of an overnight culture of X.
nematophilus was used to inoculate an Erlenmeyer flask.
Cells were then cultured as in Example 1. Growth was
estimated by measuring the optical density at 600 nm.

PRODUCTION OF CELL SUSPENSION AND SUPERNATANTS : These 
were produced as in Examples 1 and 2 . 

ACTIVITY OF CELLS AND SUPERNATANTS AGAINST P. BRASSICAE:
The cell suspension bioassay was carried out as in
Example 1, but using a single dose of suspended cells
equivalent to 50 µl of broth/g diet and measuring
mortality after 2 days. The cell supernatant bioassay
was carried out as in Example 2, but using a single dose
equivalent to 50 µl supernatant/g diet (i.e. more than
twice the LC50) and measuring mortality after 2 days.

The results are shown in Fig. 1. Thus these results
clearly show that cells taken from X. nematophilus
culture medium are highly effective as an oral
insecticide against P. brassicae after only 5 hours, and
supernatants are highly effective after 20 hours.
Although some slight cell lysis was observed in the early
stages of growth, no significant cell lysis was observed
after this point, demonstrating that the supernatant
activity may be due to an authentic extracellular agent
(as opposed to one released only after cell breakdown).

Example 4 - Synergy between X. nematophilus cells and
B. thuringiensis powder preparations

CELL GROWTH AND SUSPENSION: X. nematophilus cells were
grown and suspended as in Example 1. B. thuringiensis
strain HD1 (from Bacillus Genetic Stock Centre, The Ohio
State University, Columbus, Ohio 43210, USA) was
cultured, harvested and formulated into a powder as
described by Dulmage et al. (1970) J. Invertebrate
Pathology 15, 15-20.

ACTIVITY OF X. NEMATOPHILUS CELLS AND B. THURINGIENSIS
POWDER AGAINST P. BRASSICAE: The bioassays were carried
out using X. nematophilus and B. thuringiensis in
combination or using B. thuringiensis cell powder alone.
Bioassays were carried out as in Example 1 but with
various dilutions of B. thuringiensis powder in place of
X. nematophilus. For the combination experiment, a
constant dose of X. nematophilus cell suspension
sufficient to give 25% mortality was also added to the
diet. Mortality was recorded after 2 days.

                              LC50 (µg Bt powder/g diet)
Bioassay                      2 days
B.t. alone                    1.7
B.t. plus X. nematophilus     0.09

These results clearly demonstrate the synergism between
X. nematophilus cells and B. thuringiensis powder when
acting as an oral insecticide against P. brassicae.

Example 5 - Synergy between X. nematophilus
supernatants and B. thuringiensis powder

CELL GROWTH AND PRODUCTION OF SUPERNATANTS: X.
nematophilus cells were grown and supernatants prepared
as in Example 2. B. thuringiensis was grown and treated
as in Example 4.

ACTIVITY OF X. NEMATOPHILUS SUPERNATANTS AND Bt CELL
POWDER AGAINST P. BRASSICAE:

The bioassays were carried out using X. nematophilus
supernatants and B. thuringiensis in combination or using
B. thuringiensis powder alone. The bioassay against
neonate P. brassicae and two day old Pieris rapae and
Plutella xylostella larvae was carried out as in Example 2
but with various dilutions of B. thuringiensis in place
of X. nematophilus. For the combination experiment, a
constant dose of X. nematophilus supernatant sufficient
to give 25% mortality was also added to the diet.
Mortality was recorded after 4 days.

                  LC50 (µg Bt powder/g diet)
Insect species    Bt alone      Bt plus Xn
P. brassicae      1.4           0.12
P. rapae          2.5           0.26
P. xylostella     7.2           0.63

These results clearly demonstrate the synergism between
X. nematophilus supernatants and B. thuringiensis powder
when acting as an oral insecticide against several insect
species. The fact that both X. nematophilus cells and
supernatants demonstrate this synergism strongly suggests
that a single agent or type of agent is responsible for
the demonstrated activities.

Example 5 - Characterisation of insecticidal agent from
X. nematophilus supernatant by proteolysis

CELL GROWTH AND PRODUCTION OF SUPERNATANTS: X.
nematophilus cells were grown and supernatants prepared
as in Example 2.

PROTEOLYSIS OF SUPERNATANT: Culture supernatant (50 ml)
was dialysed against 0.5 M NaCl (3 x 1 l) for 48 hours at
4°C. The volume of the supernatant in the dialysis tube
was reduced five-fold by covering with polyethylene
glycol 8000 (Sigma Chemicals). Samples were removed and
treated with either trypsin (Sigma T8253 = 10,000
units/mg) or proteinase K (Sigma P0390 = 10 units/mg) at
a concentration of 0.1 mg protease/ml sample for 2 hours
at 30°C.

ACTIVITY OF PROTEASE TREATED SUPERNATANT AGAINST P.
BRASSICAE: The bioassay against neonate P. brassicae
larvae was carried out by spreading 25 µl of each
'treatment' on the artificial agar-based diet referred to
in Example 1 in a 4.5 cm diameter plastic pot. Four pots
each containing 10 larvae were used for each treatment.
Mortalities were recorded after 1 and 2 days. Controls
using water only, trypsin (0.1 mg/ml) and proteinase K
(0.1 mg/ml) were also tested in the same way.





                                    % Mortality
Treatment                           1 day    2 days

Untreated supernatant               60       100
Proteinase K treated supernatant    45       100
Trypsin treated supernatant         40       100
All controls (no supernatant)       0        0
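
Because each treatment used four pots of 10 larvae, the percentages in the table correspond to whole numbers of larvae out of 40. The sketch below, a hedged illustration with hypothetical names (dead_larvae, retained_day1), converts the tabulated percentages to counts and expresses day 1 mortality relative to the untreated supernatant.

```python
POTS_PER_TREATMENT = 4
LARVAE_PER_POT = 10
LARVAE_PER_TREATMENT = POTS_PER_TREATMENT * LARVAE_PER_POT

# % mortality after (1 day, 2 days), as tabulated above.
mortality = {
    "Untreated supernatant": (60, 100),
    "Proteinase K treated supernatant": (45, 100),
    "Trypsin treated supernatant": (40, 100),
    "All controls (no supernatant)": (0, 0),
}


def dead_larvae(percent, n=LARVAE_PER_TREATMENT):
    """Number of dead larvae implied by a percentage mortality."""
    return round(percent * n / 100)


counts = {t: (dead_larvae(d1), dead_larvae(d2)) for t, (d1, d2) in mortality.items()}

# Day 1 mortality relative to the untreated supernatant.
untreated_day1 = mortality["Untreated supernatant"][0]
retained_day1 = {t: d1 / untreated_day1 for t, (d1, _) in mortality.items()}

print(counts)
print(retained_day1)
```

On this reading, protease treatment slows killing on day 1 to about two thirds to three quarters of the untreated level, while mortality still reaches 100% by day 2.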

Example 6

Entomocidal activity of other Xenorhabdus strains

Using the methodology of Examples 1 and 2, four different
Xenorhabdus strains were tested against insect pests.

The results obtained were as follows:

I) Activity to Pieris brassicae

Strain deposit   Cells 10 C /grm diet   Supernatant LC50
no/code          % mortality            µl/gram of diet

NCIMB 40887      100                    0.09
0014             100                    0.52
0015             80                     3.73
NCIMB 40886      100                    0.05



It was found that entomocidal activity of cells and
supernatant was reduced by more than 99% when all four
strains were heated at 80°C for 10 minutes.
II) Activity to mosquitoes (Aedes aegypti)
Bacteria added at the rate of 10 cells/ml of water

Strain deposit   Cells 10 /grm diet
no/code          % mortality

NCIMB 40887      0
0014             40
0015             45
NCIMB 40886      95



Furthermore, all strains significantly reduced the growth
of Heliothis virescens.

Example 7

Cloning of toxin genes from strains of Xenorhabdus

Total cellular DNA was isolated from NCIMB 40887 and ATCC
19061 using a Qiagen genomic DNA purification kit.
Cells were grown in L broth (10g tryptone, 5g yeast
extract and 5g NaCl per l) at 28°C with shaking (150rpm)
to an optical density of 1.5 A600. Cultures were
harvested by centrifugation at 4000xg and resuspended in
3.5ml of buffer B1 (50mM Tris/HCl, 0.05% Tween 20, 0.5%
Triton X-100, pH 7.0) and incubated for 30 mins at 50°C.
DNA was isolated from bacterial lysates using Qiagen
100/G tips as per the manufacturer's instructions. The
resulting purified DNA was stored at -20°C in TE buffer
(10mM Tris, 1mM EDTA, pH 8.0).

A representative DNA library was produced using total DNA
of NCIMB 40887 and ATCC 19061 partially digested with the
restriction enzyme Sau3A. Approximately 20µg of DNA from
each strain was incubated at 37°C with 0.25 units of the
enzyme. At time intervals of 10, 20, 30, 45 and 60
minutes, samples were withdrawn and heated at 65°C for 15
minutes. To visualise the size of the DNA fragments, the
samples were electrophoresed on 0.5% w/v agarose gels.
The DNA samples which contained the highest proportion of
30 to 50kb fragments were combined and treated with 4
units of shrimp alkaline phosphatase (Boehringer) for 15
minutes at 37°C, followed by heat treatment at 65°C to
inactivate the phosphatase.

The size selected DNA fragments were ligated into the
BamHI site of the cosmid vector SuperCos 1 (Stratagene)
and packaged into the Escherichia coli strain XL1 Blue,
using a Gigapack II packaging kit (Stratagene) in
accordance with the manufacturer's instructions.

To select for cosmid clones with entomocidal activity,
individual colonies selected on L agar plates containing
25µg/ml ampicillin were grown in L broth (containing
25µg/ml ampicillin) overnight at 28°C. Broth cultures
(50µl) were individually spread onto the surface of
insect diet contained in 4.5cm diameter pots, as
described in Example 5. To each container 10 neonate P.
brassicae larvae were added. Larvae were examined after
24, 72 and 96 hours, recording mortality and size of
surviving larvae. A total of 220 clones of NCIMB 40887
were tested, of which two were found to cause reduction
in larval growth and death within 72 hours. Of 370
clones from ATCC 19061, one was found to cause larval
death within 72 hours.
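
The screening outcome can be summarised as a hit rate per library. The following is a small sketch using only the clone counts given above; the variable names are illustrative assumptions.

```python
# (active clones, clones screened) for each library, as reported above.
screens = {
    "NCIMB 40887": (2, 220),
    "ATCC 19061": (1, 370),
}

# Fraction of screened cosmid clones showing entomocidal activity.
hit_rate = {strain: active / screened for strain, (active, screened) in screens.items()}

total_active = sum(active for active, _ in screens.values())
total_screened = sum(screened for _, screened in screens.values())

for strain, rate in hit_rate.items():
    print(f"{strain}: {rate:.2%} of clones active")
print(f"Overall: {total_active} active clones out of {total_screened} screened")
```

Fewer than 1% of clones in either library were active, which is why several hundred clones per strain had to be bioassayed individually.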

Example 8

Activity of cloned toxin genes to Pieris brassicae

The three active clones from Example 7 were grown in L
broth, containing 25µg/ml ampicillin, for 24 hours at
28°C, on a rotary shaker at 150rpm. The activity of the
toxin clones to neonate larvae was determined by
incorporation of whole broth cultures into insect diet,
as described in Example 1.
Clone no.   Strain         LC50 (µl broth/g insect diet)

1           NCIMB 40887    13.03
2           NCIMB 40887    16.7
3           ATCC 19061     108.7
Control*                   No effect at 100µl/g

*XL1 Blue E. coli broth

When E. coli toxin clones were heated at 80°C for 10
minutes and added to the diet at a rate of 100µl/g, no
activity to larvae was detected, highlighting the heat
sensitivity of the toxins.

Example 9

Sequencing of the cloned toxin from NCIMB 40887

Cosmid DNA of the entomocidal clone 1 above from NCIMB
40887 was purified using the Wizard Plus SV DNA system
(Promega) in accordance with the manufacturer's
instructions. A partial map of the cloned fragment was
obtained using a range of restriction enzymes, EcoRI,
BamHI, HindIII, SalI and SacI, as shown in Figure 3. DNA
sequencing was initiated from pUC18 and pUC19 based
sub-clones of the cosmid, using the enzymes EcoRI, BamHI,
HindIII, EcoRV and PvuII. Sequence gaps were filled
using a primer walking approach on purified cosmid DNA.
Sequence reactions were performed using the ABI PRISM™
Dye Terminator Cycle Sequencing Ready Reaction Kit with
AmpliTaq DNA polymerase FS according to the
manufacturer's instructions. The samples were analysed on
an ABI automated sequencer according to the manufacturer's
instructions. The major part of the DNA sequence for the
cloned toxin fragment is shown in Figure 2.
Example 10

Restriction map of cloned toxin from clone 3

Cosmid DNA of the entomocidal clone 3 above was purified
as described in Example 9. A restriction map of the
cloned fragment was obtained using the restriction
enzymes BamHI, HindIII, SalI and SacI and this is shown
in Figure 3. When compared with the map from clone 1
(Figure 3) it is clear that over the regions which
overlap, the restriction maps are very similar. The
only detectable difference between the two clones was a
reduction in size of two HindIII fragments in clone 3,
corresponding to the 11.4kb and 7.2kb HindIII fragments
in clone 1, by approximately 2kb and 200bp respectively.
These results indicate the overall relatedness of the DNA
region coding for toxicity in the two bacterial strains.
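
The approximate sizes of the two altered HindIII fragments in clone 3 follow from the clone 1 sizes and the reductions quoted above. This is a minimal sketch of that subtraction, working in base pairs to avoid rounding; the variable names are hypothetical and the results are only as precise as the "approximately" in the text.

```python
# HindIII fragment sizes in clone 1 (bp) and the approximate reductions
# observed in clone 3 (bp), as stated in the text.
clone1_hindiii_bp = [11_400, 7_200]
reduction_in_clone3_bp = [2_000, 200]

# Expected sizes of the corresponding fragments in clone 3.
clone3_hindiii_bp = [
    size - reduction
    for size, reduction in zip(clone1_hindiii_bp, reduction_in_clone3_bp)
]

for c1, c3 in zip(clone1_hindiii_bp, clone3_hindiii_bp):
    print(f"clone 1: {c1 / 1000:.1f}kb -> clone 3: approx. {c3 / 1000:.1f}kb")
```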

Example 11

Southern Blot Hybridisation Experiments

A 10.3kb BamHI-SalI fragment of the DNA from clone 1 was
used as a probe to hybridise to total HindIII digested DNA
of the Xenorhabdus strains ATCC 19061, NCIMB 40886 and
NCIMB 40887. Hybridisation was performed with 20ng/ml of
DIG labelled DNA probe at 65°C for 18 hours. Filters
were washed prior to immunological detection twice for 5
minutes with 2 x SSC (0.3M NaCl, 30mM sodium citrate, pH
7.0)/0.1% (w/v) sodium dodecyl sulphate at room
temperature, and twice for 15 minutes with 0.1 x SSC
(15mM NaCl, 1.5mM sodium citrate, pH 7.0) plus 0.1%
sodium dodecyl sulphate at 65°C. The probe was labelled
and experiments performed in accordance with the
manufacturer's instructions, using a non-radioactive DIG
DNA labelling and detection kit (Boehringer). The probe
hybridised to a HindIII fragment of approximately 8kb in
all three strains as well as an 11.4kb fragment in NCIMB
40887 and an approximately 9kb fragment in both NCIMB 40886
and ATCC 19061. These results show that strains NCIMB
40886 and ATCC 19061 contain DNA with close homology to
the toxin gene of clone 1 above, confirming the
similarity between the toxins produced by the three
strains.
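
The two wash buffers quoted in Example 11 are dilutions of the same SSC stock, so their salt concentrations scale linearly with the stated strength. The sketch below assumes the 1 x composition implied by the 2 x figures in the text (0.15M NaCl, 15mM sodium citrate); the function name ssc_composition_mM is a hypothetical helper.

```python
# 1 x SSC in mM, derived from the 2 x SSC figures given in the text
# (0.3M NaCl, 30mM sodium citrate).
ONE_X_SSC_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition_mM(strength):
    """Salt concentrations (mM) of SSC at the given strength, e.g. 2 or 0.1."""
    return {salt: conc * strength for salt, conc in ONE_X_SSC_MM.items()}


low_stringency_wash = ssc_composition_mM(2)     # room temperature wash
high_stringency_wash = ssc_composition_mM(0.1)  # 65°C wash

print(low_stringency_wash)
print(high_stringency_wash)
```

The twenty-fold drop in salt between the two washes, together with the rise to 65°C, is what makes the second wash the stringent step.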
CLAIMS

1. An insecticidal composition adapted for oral
administration to an insect comprising a pesticidal
material obtainable from a Xenorhabdus species, or a
pesticidal fragment thereof, or a pesticidal variant or
derivative of either of these.

2. A composition according to claim 1 wherein the said
pesticidal material comprises material encoded by the
nucleotide sequence of Figure 2 or variant or fragment
thereof, or a sequence which hybridises with said
sequence.

3. A composition according to claim 1 or claim 2 which
comprises cells of Xenorhabdus.

4. A composition as claimed in any one of the
preceding claims which comprises supernatant taken from
cultures of cells of Xenorhabdus species.

5. A composition according to any one of the preceding
claims wherein the Xenorhabdus species is Xenorhabdus
nematophilus.

6. A composition according to any one of claims 1 to 4
wherein the Xenorhabdus species is ATCC 19061, NCIMB
40886 or NCIMB 40887.

7. A composition as claimed in any one of the preceding
claims which comprises a further pesticidal material not
obtainable from Xenorhabdus.

8. A composition according to claim 7 wherein the said
further pesticidal material comprises a material
obtainable from B. thuringiensis.
9. A composition according to claim 8 which further
comprises cells of B. thuringiensis.

10. A composition according to claim 8 wherein the
pesticidal material obtainable from B. thuringiensis
comprises the delta endotoxin.

11. A composition according to any one of the preceding
claims which further comprises an agriculturally
acceptable carrier.

12. A composition according to claim 10 wherein the
carrier comprises items of insect diet.

13. A method for killing or controlling insect pests,
which method comprises administering to a pest or the
environment thereof a composition according to any one of
the preceding claims.

14. A method as claimed in claim 12 wherein the pests
are insects from the order Lepidoptera or Diptera.

15. A microorganism comprising Xenorhabdus strain NCIMB
40886.

16. A microorganism comprising Xenorhabdus strain NCIMB
40887.

17. A pesticidal agent which comprises a toxin
comprising a protein which is encoded by DNA which
includes SEQ ID No. 1 or a variant or fragment thereof.

18. An isolated pesticidal agent characterised in that
it is obtainable from cultures of X. nematophilus or
mutants thereof, has oral pesticidal activity against
Pieris brassicae, Pieris rapae and Plutella xylostella,
is substantially heat stable to 55°C, is proteinaceous,
acts synergistically with B. thuringiensis cells as an
oral pesticide, and is substantially resistant to
proteolysis by trypsin and proteinase K.

19. An isolated pesticidal agent as claimed in claim 18
further characterised in that the pesticidal activity is
substantially destroyed by treatment with sodium dodecyl
sulphate or acetone or heating to 80°C.

20. An isolated pesticidal agent as claimed in claim 18
or claim 19 further characterised in that the agent is an
extracellular protein.

21. A recombinant DNA which encodes a pesticidal agent
according to any one of claims 17 to 20.

22. A recombinant DNA of claim 21 which comprises the
sequence of Figure 2 or a variant or fragment thereof.

23. A recombinant DNA which comprises or hybridises
under stringent conditions with all or part of the
sequence of Figure 2, and which encodes a pesticidal
material.

24. An expression vector comprising a recombinant DNA
according to any one of claims 21 to 23.

25. A host organism which has been transformed with an
expression vector according to claim 24.

26. A host organism as claimed in claim 25 which has been
engineered or selected such that it also expresses other
pesticidal proteinaceous toxicity enhancing materials.

27. A host organism comprising a nucleotide sequence
coding for a fusion protein comprising a pesticidally
active portion of an agent as claimed in any one of
claims 17 to 20 in combination with other pesticidal
proteinaceous toxicity enhancing materials.
28. A host organism as claimed in claim 27 wherein the
pesticidal toxicity enhancing materials comprise delta-
endotoxin from B. thuringiensis.

29. A host organism as claimed in any one of claims 25 to
28 wherein the host is a plant.

30. A host organism as claimed in any one of claims 25 to
28 wherein the host is a virus pathogenic to insects.

31. A fusion protein as expressed by a host as claimed in
claim 27.

32. A pesticidal composition comprising one or more
agents as claimed in any one of claims 17 to 20.
Fig. 2.

1 TCCACAATTG CCGGAGAAAA TCAGTCGGGA ACTGCCGGTG ATTATTCGTC ACTTATTAAA 

61 CGAATTTGCC GACCAGAATA AGGCTAAAAA ACTGCTACAG GCGCAACGCG ACTCGAACGA
121 AGCGTTAACG GTAAAGAGTC ATTCGGATCC GCTGTATCGC TTTTGTGGTT ATCTGGTGTC
181 TGTCAATGAT ATGACCGGAA TGAAGATGGG CAATAAAAAC ATTAGCCCAC GAGCACCGAG
241 ATTGTACTTG TATCATGCCT ATCTCTCTTT TATGGAAGCG CACGGCTTTG AACGTCCGTT
301 AACACTGACT AAGTTTGGTG AATCCATCCC CAAGATTATG CTGGAATACC GGAAGGAGTA
361 TCGAAAAGTG CGAACCAAGA AAGGCTATTC CTATAACGTG GAATTATCGG AAGAGGCCGA 
421 AGAATGGCTA CCGTCAGTGC CTGAGTGTCG AGACTTTAAA TCACCTGTAT AAAACTTTGA
481 GCTTTAAGTC TGCACTCCAT ACACAACTTA AAATATCTAA TTGTATTTAA AAGAAAATAA
541 TAGATGTATA GTTATTTTTT AACTATACAT AAGCTCTACA TGCTCTTCAT TCGTGTAAAA 
601 AATGGGTGAA CAGGTGATAC AGTCAGTGAA TATCATATTA ATTACCGTAA ACCCAGATGT
661 AGCAAGGCTT TCAGGGAATT GTGCAGAGGG TGCATAACTG AGAGGGTGAA AAAGATTTTC 
721 AGGGGGGCTT ATGGCAGGTA AACAAAATCA GAAGCAAATA CCGTGCACAA TCTGGTTTTT
781 ATTTTTTGGT ACTACCTCAA ATTAAAATGA TGTAATCATC TGATTTTATT TAAGAATAGA
841 AGTTAATCAC AATTTCATTG ATGGACTTTC ATTCACACTG GTATAGATAA ATAATTCTGT 
901 TATATCCTGT TTCATTACGC ATTCATCAGG AGTGCTGTTA CAGGAGACAA GAATGTCACA
961 CATCATTTAC TTGTCGTTAA AGGGCAAGAA GCAGGGTTTA ATTTCAGCGG GTTGTTCAAC

1021 GCCTGAATCA ATTGGAAATC GCTATCAAAA AGGACGTGAA GATCAAATAC AGGTATTGAG
1081 CCTGAATCAT TCGATGAGCC GTGACCAGAA TGTTAATCAT CAACCCGTCA GTTTTGTGAA
1141 ACCCATTGAT AAATCCTCTC CCCTGTTTGC TGGATGCCAG TTTTGTGCAT TACAGGACAA
1201 GCCA3ATGGG ACAACTGGAG TTCTTTTATG AAATCAAGCT GACCAGTGCC ACGATTGTGG
1261 ATATTTCCTA TAATTATCCG GCATTCAATC AATGATAATG GTGCGATACC CCATGAAGTG
1321 GTGATGCTCG ATTATAAGTC CATTTCATGC AACCACATCG CCGCAGGACT TCGGGCTACA
1381 GCATACGCAA TTAGCCGGAA GTGAAGAAGC AAGCCGCTTT TATCTGGGGT CTCGAATGTT
1441 AAGCCACTTA AGAAGCCGCT GGTTGAAGAA ACCCCGGTAA AACCCGCTAA ACATCATGCC
1501 CGTTATCGTT GTGTGGATGA TGACGGCAAT CTTTTAACCG AACGCAAGTA TCGGGTTTGC
1561 CTGCCGGATG GTCAGATAAA AGAAGGAAAG ACTGATAAAC AAGGTTACAC CCAATGGCAT
1621 CTTACGGATG ACAAAAATAA ACTTGAATTT CATATTTTAA AGGATTAATA CCATGCCAGC 
1681 CTATACCGTT CAGACAAAAA TAGAATCCAA CGTACCTGTT GAAAACCTGC TTTACGACTT
1741 AACCATTTAT CGTAAGGATG CAAAAGGAAA TTTCCATATC TTGCTTGATG TTTTTCAGGA 
1801 GAAA^TACAG AGTAATTATG AAACACAACA GCATATCAC3 CAGGAAATAG ACGACGATCT 
1861 TTCTGTGATT TATATTATGC AAATTATGCT TCACCGCAAA CATGGCTCAA ATATATTTCC 
1921 GGCACTGCAA ACCCATTTTA AGAAAATGTA TACCCTCGGT GAATTAACTT CCGGTAAAGC
1981 CTGTTCGGAG AAAAAACGGG AAAATGCCTG TTATTTTGAA AGTACAGTTG AAACAAAACC
2041 TGTCAGCGAC GGGGATAATA CCGTTGACTT AAATATCACT ATTCCTGAAC GACCTTTTAT
2101 TGCCAAAGAA TATCCCATTG GTCACCCACA CGATCCATTT GAAAAAAGTA AAATTGAATC 
2161 ATAAATACAG GACAGGTTAT CGAAAAGAAT TTAT3CGGAT CAAAATGGAG CAAGTTTATG
2221 TCAGGGCGCG AGCACACTAT TTTAGCTGCG TTTTTAAGAT GATTATCTCT TAATGTTCAG
2281 TTTTAATAGT GTTTTTATCG AGTGAAATTT AATCGCACAG GCAATTCTTT AGACTTTTAT 
2341 AGAAAACTAA AGAATTAAAG AACAAGATTG ACATTTTAAG TTCAAATATT AATCAAAGTA 
2401 TGCTCGCGCC CTGAGTTTAT GTGGCCCTGC CGCTTTTTTT TATTGCCTGC CAATAGATAG 
2461 ACCAGATATT TATGAGCAAG CGGCACGAGA ATTATGGCAA TATGGCCGAA CTAAAATTGG
2521 TCAACTGGAA ATTAAGCCGG GTGAGGGTTG CCGACATCCT AAAGGTACTT TTTATAATCA
2581 ATATGGTGAA AGAATATCTG GGTTAGATTG GCTGACATTG GCAAGCCTAA GAGATTCAGA
2641 AAATATGATG ATGAGGTTGA TGATGAAGTA GCTGGTATTA CAATGTGGGG AAAATTGACA 
2701 GAATGGTTTG AAAAATCAGG GTATGAAAAA GTATTTAGTA ATGTCGGCTT ATCCCATTCT 
2761 AATATAAATG ACATAGTAAC TCTTAGTGAT TACTATAACA AAGGATATCA TGTTGTTACT
2821 TTGATTTCAG CAGGAATGTT ATCAGATTTT GGTGACATAG AAACATCAGG AAAAAATCAT
2881 TGGATAGTTT GGGAAGGAGT AGTAGAAAAC TATGAGAAAG AAAATATCAC AAATAATTCA
2941 GATCTGAATC AATATGTAAA TTTAAATCTG TTTTCATGGG GTAAAGTGGA ACATCAAATT
3001 AAAAAAAACA AATCACTAGA TTATGTACTC AACCATATTT TTTGAGGGTT GGTTTTTAAA
3061 CCAATGAAAT AACATGAAAA AAATATTAAT TATTTTTATT TTTTTACTTT ATGGTTGTGG 
3121 TAATZCAACG CCAAAAGTTT TACCAAAATC AGAGTTTCTT CCTGATGCAG TGATAAATGA
3181 ACCATATCAG GCATCAATTA CCATCACAGG AGGTGCATTG AATGAAAAAA GCGTTTGGGT
3241 AAAAATTCAT CCTACTGGCT CAGGACTAAC ATGGAATCCA AAAGATAGTT CTTTCCTATA
3301 GGGTGGAAAA AAAGAAATAA GAAAAGATTA TCATCATATA AATATAACAG GTACCCCAAA
3361 GAAGACAGAA TTGATAAAAA TTGAAGTGGT AGGATTTACA TTGGGTACAA TGTACGCACG
3421 GAAAGAGTTC ACTATAAATT ATACTATAAA AGTAAGGGAA TAATTGTCAC TATCAGAATG
3481 GTGATTTAAT TCGCCATTTT TATACTTTTG TATACTCTCT CAACATAATC AGGATTCTTT
3541 CTTATTATTT TTCATGGTGC TAAAAACGTT TATTGCAAAA ATAAATTAAG TTAATCAGAT
3601 AAATTATCTG CATTACTGTT ATAATCGATA ACACGATAAC CTGACTTTCT GCCTGTTCTT
3661 ATGAACTCGA AGATAATCCT TTCTGAGCCT GAACGAATCA CATTGCAACC ACTCGCTTTG
3721 AATCACCCAC ACCGGGACAT TCGTACGCGA GGAACGGGTT TACTCATGCT TGCCAGAGGG 

3781 AGCAAGCCGT CCCAGATCAC CGCTGAAATC GGATGCAGTC TCCGGGTTAT CTGTAATTGG
3841 GTTCACATGT GGCACAGATA GCGGGATTAT TCGGCGGTCA TGCCGGAGGC CGGTATCTCG 
3901 CCATGACGCC TGACATGATT GCCACTGCGC TCGAAGCCGC CAGCGCAGAG TCCCTGACGT 
3961 GCGTCGAAGC CAGGCAGGGT TTCCCTGCCT TGTACGCTTG AAACGCTGGC GAATACCCTG 

4021 AAAAAACAGG GGCTCCCCTA TAAACGCCCC CGCCTGTCGC TTAAAAAAAG CGCAATAAAA
4081 CGGAGTTTGC TGAAAAATCC GCCTTGCTGA ATAAAATTAA GGCCGGAGCA CAGTCAGGAC
4141 ATTACCGTCT GGTCTATTTT GAGTTCTGGG GGCGTTAAAT TACACGGATA ACACGCTGTT 
4201 TTACCAGACA ACGTCAGGCA GTATCACGCG AGATGACGTG ATTGATTTTT TAGAGCCGGT 
4261 GGCCAGACAA GGGACAACCG CCTGACATTT TTAGTGTTGG ATAATGCGCG TATCCATCAC
4321 GGGATAGAGG AAAAAATCAG AAATGGCGGG TGACGAGAAC ACAACCTGTT TTTATTCTAT
4381 CTTCCCGCTT ACAGCCCAGA GCTGTATCTG ATTGAAATCG TCTGGAAACA GGCCAAATAC
4441 GACTGGCGAC GTTTTATCAC CTGGACTCAG GATACAATGG AATATGAGGT AAATACTTTA
4501 TTGAAAGGTT ATGGCGACCA ATTTGCAATT AACTTTTCTT GAGTACTTAG TAAGAATAGA 
4561 GTCAGTCGAG GTTTTTTCAT TTCGGGTCGT GGGGATGATA CTGAAAATTT GTTTGTAATC 
4621 TCTGAAAATT GCTGTTTCTG TGGCTACGTC TGTCTTTTGG GATATTGTTT CCATCAAGTC 
4681 TGTCAACATA CTGTTAAGTT AGATGTTGAT AAAAGAGACT GAATTATAAT ACAAAACAAT
4741 AAATCACTTG GACAATATTT TATTTCACAT GAGACATTAA GGTTGATTTT CCCAATCTGG
4801 TCAGTTATAA CCGAATAAGG ATCTTGAAAA ATCATGGGAT CTTACTTTTA TCAAATGAAG
4861 TTAACGTAAA AGTTGATAAA GAAAATTATT TAATTCTAAG TGCCGTTGGC ATAAATATTT
4921 TGTGTTTTGT TAATGAATGA ATAACCAGGT AAGCTGGATT TTCATTTTTT AATTACTCGT
4981 TACAATATGC TATTTATTTA TATAAAGAGT TTGTGCCCAT TTAACCAGTA AACAAATTTG
5041 TTCAACCGTA ACTTAGCTTC ATCGACTTTT GGCCTCGCCT GGTCAGAATC TAGGGCCGTT 
5101 ATCCTATTTA TTTATGATAA ATAAAATTTA ATTATCTTTA ATAAGCTGAA TATGTGGATT 
5161 TGTGCTCAAT CTTGGATTCA AGTATGTATT CCTTTTGGTA CCCTGCTTTA TTTTAAGGCA
5221 GATGAAGAGG ATGCCAACAT GACACAATAT CGATTACGAC TGTAACATTA AAGTCAGTTA
5281 TAAATTTTAT GATTAAAATG AAATTTTAGT AGAAAATCGT ATTCTATTCC GCCATTTACA 
5341 ATAGCATCCT CTTTAATATC ATTAATCTCA GATAAAACAA ATAATTACAA TGTGAATAGA 

5401 ATAATGACTT ACAAAATAAG CACTAAATCT TCAGATGAAC TCTTAACTGA CAACACTATT
5461 TTATAAAATA ATTGAGGTTA TTATGTATAG CACGGCTGTA TTACTCAATA AAATCAGTCC
5521 CACTCGCGAC GGTCAGACGA TGACTCTTGC GGATCTGCAA TATTTATCCT TCAGTGAACT
5581 GAGAAAAATC TTTGATGACC AGCTCAGTTG GGGAGAGGCT CGCCATCTCT ATCATGAAAC
5641 TATAGAGCAG AAAAAAAATA ATCGCTTGCT GGAAGCGCGT ATTTTTACCC GTGCCAACCC
5701 ACAATTATCC GGTGCTATCC GACTCGGTAT TGAACGAGAC AGCGTTTCAC GCAGTTATGA
5761 TGAAATGTTT GGTGCCCGTT CTTCTTCCTT TGTGAAACCG GGTTCAGTGG CTTCCATGTT
5821 TTCACCGGCT GGCTATCTCA CCGAATTGTA TCGTGAAGCG AAGGACTTAC ATTTTTCAAG
5881 CTCTGCTTAT CATCTTGATA ATCGCCGTCC GGATCTGGCT GATCTGACTC TGAGCCAGAG 
5941 TAATATGGAT ACAGAAATTT CCACCCTGAC ACTGTCTAAC GAACTGTTGC TGGAGCTATT 
6001 ACCCGCAAGA CCGGAGGTGA TTCGGACGCA TTGATGGAGA GCCTGTCAAC TTACCGTCAG 
6061 GCCATTGATA CCCCTTACCA TCAGCCTTAC GAGACTATCC GTCAGGTCAT TATGACCCAT
6121 GACAGTACAC TGTCAGCGCT GTCCCGTAAT CCTGAGGTGA TGGGGCAGGC GGAAGGGGCT 
6181 TCATTACTGG CGATTCTGGC CAATATTTCT CCAGAACTGT ATAACATTTT GACCGAAGAG 
6241 ATTACGGAAA AGAACGCTGA TGCTTTATTT GCGCAAAACT TCAGTGAAAA TATCACGCCC 
6301 GAAAATTTCG CGTCACAATC ATGGATAGCC AAGTATTATG GTCTTGAACT TTCTGAGGTG 
6361 CAAAAATACC TCGGGATGTT GCAGAATGGC TATTCTGACA GCACCTCTGC TTATGTGGAT 
6421 AATATCTCAA CGGGTTTAGT GGTCAATAAT GAAAGTAAAC TCGAAGCTTA CAAAATAACA 
6481 CGTGTAAAAA CAGATGATTA TGATAAACAT GTAAATTACT TTGATCTGAT GTATGAAGGA 
6541 AATAATCAAT TCTTTATATG TGCTAATTTT AAGATATCGA GAGAATTTGG GGCGACTCTT 

6601 AGGAAAAACT CAGGGACAAG TGGCATTGTC GGCAGCCTTT CCGGTCCCCT GGTAGCCAAT
6661 ACTAATTTCA AAAGCAATTA CTTAAGTAAC ATATCTGATA ATGAATACAG AAATGGCGTA 
6721 AAAATATATG CCTATCGCTA TACGTCTTCC ACCAGCGCCA CAAATCAGGG CGGCGGAATA 

6781 TTCACTTTTG AGTCTTATCC CCTGACTATA TTTGCGCTCA AACTGAATAA AGCCATTCGC
6841 TTGTGCCTGA CTAGCGGGCT TTCACCGAAT GAACTGCAAA CTATCGTACG CAGTGACAAT
6901 GCACAAGGCA TCATCAACGA CTCCGTTCTG ACCAAAGTTT TCTATACTCT GTTCTACAGT 
6961 CACCGTTATG CACTGAGCTT TGATGATGCA CAGGTACTGA ACGGATCGGT CATTAATCAA 

7021 TATGCCCGAC GATGACAGTG TCAGTCATTT TAACCGTCTC TTTAATACCC CGCCGCTGAA
7081 AGGGAAAATC TTTGAAGCCG ACGGCAACAC GGTCAGCATT GATCCGGATG AAGAACAATC
7141 TACCTTTGCC CGTTCAGCCC TGATGCGTGG TCTGGGGATC AACAGTGGTG AACTGTATCA 
7201 GTTAGGCAAA CTGGCGGGTG TATTGGACAC ACAAAATATC CTCACACTTT CTGTCCCTGT 

7261 TATATCTTCA CTGTATCGCC TCACGTTACT GGCCCGTGCC CATCAGCTGA CGGTTAATGA
7321 ACTGTGTATG CTTTATGGTT TTTCGCCGTT CAATGGCAAA ACAACGGCTT CTTTGTCTTC
7381 CGGGGAGTTG TCACGGCTGG TTATCTGGTT GTATCAGGTG ACGCAGTGGC TGACTGAGGG

7441 CGGAAATCAC CACTGAAGCG ATCTGGTTAT TATGTACGCC AGAGTTCAGC GGGAATATTT 

7501 CACCGGAAAT CAGTAATCTG CTTAATACTC TCCGACCCCG TATTAGTGAA GACATGGCAC
7561 AAAGTAGTGA CCGGGAGCTT CAGGCTGAAA TTCTCGCGCC GTTTATTGCT GCAACGCTGC
7621 ATCTGGCGTC ACCAGATATG GCGCGGTATA TCCTGTTGTG GACTGATAAC CTGCGGCCGG
7681 GCGGCCTGAA TATCGCCGGA TTTATGATGC TGGTGCTGAA AGAGACGCTG AGTGATGAGG
7741 AAACGACCCA ACTGGTTCAA TTCTGCCATG TAATGGCACA GTTATCGCTT TCCGTGCAGA 
7801 CACTGCGTCT CAGTGAAGCA GAGCTTTCTG TGCTGGTCAT TTCCGATTTT GTGGTACTGG
7861 GTGCGAGAAG CCAACCGCCG GACAACACAA TATTGATACT CTGTTCTCAC TCTACCGATT 
7921 CCACCAGTGG ATTAATGGGC TGGGAAATCC CGGCTCTGAC ACGCTGGATA TGCTGCGCCA 
7981 AGCAGACACT CACGGGCGAC AGACTGGGCC TCCGTGATGG GGCTGGACAT CAGTATGGTA
8041 ACGCAGGCCA TGGGTTCCCG CCGGCGTGAA CCAACTTCAG TGTTGGCAGG ATATCAACCC
8101 CGTGTTGCAG TGGATACATG TGGCATCAGC ACTGCTCACT GATGCCGTCG GTTATCCGTA
8161 CGCTGGTGAA TATCCGTTAC GTGACTGCAT TAAACAAAGC CGAGTCGAAT CTGCCTGCCT 
8221 GGGATAAGTG GCAGACGCTG GCAGAAAATA TGGCAGCCGG ACTGAGTACA CAACAGGCTC
8281 AGACGCTGGC GGATTATACC GCAGAGCGCC TGAGTAACGT GTTGTGCAAT TGGTTTCTGG
8341 CGAATATCCA GCCAGAAGGG GTGTCCCTGC ACAGCCGGGA TGACCTGTAC AGCTATTTCC
8401 TGATTGATAA TCAGGTCTCT TCTGCCATAA AAACCACCCG ACTGGCAGAG GCCATTGCCG
8461 GTATTCAGCT CTACATCAAC CGGGCGCTGA ACCGGATAGA GCCTAATGCC CGTGCCGATG
8521 TGTCAACCCG CCAGTTTTTT ACCGACTGGA CGGTGAATAA CCGTTACAGC ACCTGGGGCG 

8581 GGGTGTCGCG GCTGGTTTAT TATCCGGAAA ATTACATTGA CCCGACCCAG CGTATCGGGC
8641 AGACCCGGAT GATGGATGAA CTGCTGGAAG ATATCAGCCA GAGTCAGCTC AGCCGGGACA
8701 CGGTGGAAGA GGCCTTTAAA ACTTACCTGA CCGCTTTGAA ACCGTGGCAG ACCTGAAAGT
8761 TGTCAGCGCT ATCACCGACA ACGTCAACAG CAACACCGGA CTGACCTGGT TTGTCGGCCA
8821 AACGCGGGAG AACCTGCCGG AATATTACTG GCGTAACGTG CATATATCAC GGATGCAGGC 

8881 GGGTGAACTG GCCGCCGATG CCTGGAAAGA TTGGACGAAG ATTGATACAG CGGTCAACCC
8941 ATACAAGGAT GCAATACGTC CGGTCATATT CAGGGAACGT TTGCACCTTA TCGTGGGTAG
9001 AAAAAGAGGA AGTGGCGAAA AATGGTACTG ATCCGGTGGA AACCTATGAC CGTTTTACTC
9061 TGAAACTGGC GTTTCTGCGT CATGATGGCA GTTGGAGTGC CCCCTGGTCT TACGATATCA
9121 CAACGCAGGT GGAGGCGGTC ACTGACAAAA AACCTGACAC TGAACGGCTG GCGCTGGCCG 
9181 CATCAGGCTT TCAGGGCGAG GATACTCTGC TGGTGTTTGT GTACAAAACC GGGGTGAGTT 
9241 ACCCGGATTT TGGCGACAAC AATAAAAATG TGGCAGGCAT GACCATTTAC GGCGATGGCT 
9301 CCTTCAAAAA GATGGAGAAC ACAGCACTCA GCGTTACAGC CAACTGAAAA ATACCTTTGA
9361 TATCATTCAT ACTCAAGGCA ACGACTTGGT AAGAAAGGCC AGCTATCGTT TCGCGCAGGA
9421 TTTTGAAGTG CCTGCCTCGT TGAATATGGG TTCTGCCATC GGTGATGATA GTCTGACGGT 

9481 GATGGAAAAC GGGAATATTC CGCAGATAAC CAGTAAATAC TCCAGCGATA ACCTTGCTAT
9541 TACGCTACAT AACGCCGCTT TCACTGTCAG ATATGATGGC AGTGGCAATG TCATCAGAAA
9601 CAAACAAATC AGCGCCATGA AACTGACGGG GTTGGATGAA AGTCCCAGTA CGGCAATGCA
9661 TTTATCATCG CAAATACCGT TAAACATTAT GGCGGTTACT CTGATCTGGG GGGCCCGATC 
9721 ACCGTTTTTA TTAAAACGGA AAAACTATAT TGCATCAGTT CAAGGCCACT TGATGAACGC
9781 AGATTACACT AGGCGTTTGA TTCTAACACC AGTTGAAAAT AATTATTATG CCAGATTGTT
9841 CGAGTTTCCA TTTTCTCCAA ACACAATTTT AAACACCGTT TTCACGGTTG GTAGCAATAA
9901 AACCAGTGAT TTTAAAAAGT GCAGTTATGC TGTTGATGGT AATAATTCTC AGGGCTTCCA 
9961 GATATTTAGT TCCTATCAAT CATCCGGCTG GCTGGATATT GACACAGGTA TTAACAATAC
10021 TGATGTCAAA ATTACGGTGG TAGCTGGCAG TAAAACCCAC ACCTTTACGG CCAGTGACCA
10081 TATTGCTTCC TTGCCGGCAA ACAGTTTTGA TGCTATGCCG TACACCTTTA AGCCACTGGA
10141 AATCGATGCT TCATCGTTGG CCTTTACCAA TAATATTGCT CCTCTGGATA TCGTTTTTGA

10201 GACCAAAGCC AAAGACGGGC GAGTGCTGGG TAAGATCAAG CAAACATTAT CGGTGAAACG 

10261 GGTAAATTAT AATCCGGAAG ATATTCTGTT TCTGCGTGAA ACTCATTCGG GTGCCCAATA 

10321 TATGCAGCTC GGGGTGTATC GTATTCGTCT TAATACCCTG CTGGCTTCTC AACTGGTATC

10381 CAGAGCAAAC ACGGGCATTG ATACTATCCT GACAATGGAA ACCCAGCGGT TACCGGAACC 

10441 TCCGTTGGGA GAAGGCTTCT TTGCCAACTT TGTTCTGCCT AAATATGACC CTGCTGAACA
10501 TGGCGATGAG CGGTGGTTTA AAATCCATAT CGGGAATGTT GGCGGTAACA CGGGAAGGCA
10561 GCCTTATTAC AGCGGAATGT TATCCGATAC GTCGGAAACC AGTATGACAC TGTTTGTCCC
10621 TTATGCCGAA GGGTATTACA TGCATGAAGG TGTCAGATTG GGGGTTGGAT ACCAGAAAAT

10681 TACCTATGAC AACACTTGGG AATCTGCTTT CTTTTATTTT GATGAGACAA AACAGCAATT 

10741 TGTATTAATT AACGATGCTG ATCATGATTC AGGAATGACG CAACAGGGGA TCGTGAAAAA 

10801 TAT CJ AAG AAA TACAAAGGAT TTTTGAATGT TTCTATCGCA ACGGGCTATT CCGCCCCGAT 

10861 GGATTTCAAT AGTGCCAGCG CCCTCTATTA CTGGGAATGT TCTATTACAC CCCGATGATG
10921 TGCTTCCAGC GTTTGCTACA GGAAAAACAA TTCGACGAAG CCACACAATG GATAAACTAC
10981 GTCTATAATC CCGCCGGCTA TATCGTTAAC GGAGAAATCG CCCCCTGGAT CTGGAACTGC
11041 CGGCCGCTGG AAGAGACACT CCTGGAATGC CAATCCGTTG GATGCCATTG ATCCGGATGC

11101 CGTCGCACAA TATGACCCGA CACACTATAA AGTTGCCACC TTTATGCGCC TGTTGGATCA 

11161 ACTTATTCTG CGCGGCGATA TGGCCTATCG CGAACTGACC CGCGATGCGT TGAATGAAGC
11221 CAAGATGTGG TATGTGCGTG CTTTGGAATT GCTGGGTGAT GAGCCGGAGG ATTACGGCAG

11281 CCAACAGTGG GCCGCACCGT CTCTTTCCGT GGCGGGCAAC CACACTGTGC AAGCGGGCTA 

11341 TCAACAAGAC CTTACGGCGC TAGACAACGG AGAAGGTTGC ACTCAACCCC GCAACGCTAA 

11401 CTCGTTGGTG GTTTGGTCCT GCCGGAATAT AACCCGGAAT CAACCGATTA CTGGCAAACC
11461 TGCGTTTGCG CCTGGTTAAC CTGCGCCATA ATCCTTCCAT GACGGGCAAC CGTTATCGCT

11521 GGCGAATTAC GCGAGCCTAC GATCCGAAAG CGCTGCTCAC CAGTATGGTA CAGCCTTCTC 

11581 AGGGCGGTAG TGCAGTGCTG CCCGGCACAT TGTCGTTATA CCGCTTCCCG GTGATGCTGG 

11641 AGCGGGCCCG CAATCTGGTA GCGCAATTAA CCCAGTTCGG CACCTCTCTG CTCAGTATGG 

11701 CAGAGCATGA TGATGCCGAT GAACTCACCA CGTTGCTACT ACAGGAGGGT ATGGAACTGG

11761 CGACACAGAG CATCCGTATT CAGCAACGAA CTGTCGATGA AGTGGATGCT GATATTGCTG 

11821 TATTGGCAGA GAGCCGCCGC AGTGCACAAA ATCGTCTGGA AAAATACCAG CAGCTGTATG

11881 ACGAGGATAT CAACCACGGA GAACAGCGTG CGATGTCACT GTTTGATGCG GCGGCAGGTC 

11941 AGTCTCTGGC CGGGCAGGCG CTCTCAGTAG CAGAAGGGGT GGCTGACTTA GTTCCAAACG 

12001 TGTTCGGTTT CGCTTGTGGC GGCAGTCGTT GGGGGGCAGC ACTGCGTGCT TCCGCCTCCG 

12061 TGATGTCGCT TTCTGCCACA GCTTCCCAAT ATTCCGCAGA CAAAATCAGC CGTTCGGAAG 

12121 CCTACCGCCG CCGCCGTCAG GAGTGGGAAA TTCAGCGTGA TAATGCTGAC GGTGAAGTCA 

12181 AACAAATGGA TGCCCAGCTG GAAAGCCTGA AAATACGCGG CGAAGCAGCA CAGATGCAGG 

12241 TGGAATATCA GGAGACCCAG CAGGCCCATA CTCAGGCTCA GTTAGAGCTG TTACAGCGTA 

12301 AATTCACAAA CAAAGCGCTT TACAGTTGGA TGCGCGGCAA GCTGAGTGCT ATCTATTACC

12361 AGTTCTTTGA CCTGACCCAG TCCTTCTGCC TGATGGCACA GGAAGCGCTG CGCCGCGAGC 

12421 TGACCGACAA CGGTGTTACC TTTATCCGGG GTGGGGCCTG GAACGGTACG ACTGCGGGTT 

12481 TGATGGCGGG TGAAACGTTG CTGCTGAATC TGGCAGAAAT GGAAAAAGTC TGGCTGGAGC 

12541 GTGATGAGCG GGCACTGGAA GTGACCCGTA CCGTCTCGTT GGCACAGTTC TATCAGGCCT 

12601 TATCATCAGA CAACTTTAAT CTGACCGAAA AACTCACGCA ATTCCTGCGT GAAGGGAAAG 

12661 GCAACGTAGG AGCTTCCGGC AATGAATTAA AACTCAGTAA CCGCCAGATA GAAGCCTCAG 

12721 TGCGATTGTC TGATTTGAAA ATTTTCAGCG ATACCCCGGA AAGCTTTGGC AATACCCGTC
12781 AGTTGAAACA AGTGAGTGTC ACCTTGCCGG CGCTGGTTGG TCCGTATGAA GATATCCGGG
12841 CGGTGCTGAA TTACGGCGGC AGCATCGTCA TGCCACGCGG TTGCAGTGCT ATTGCTCTCT
12901 CCCACGGCGT GAATGACAGT GGTCAATTTA TGCTGGATTT CAACGATTCC CGTTATCTGC
12961 CGTTTGAAGG TATTTCCGTG AATGACAGCG GTAGCCTGAC GTTGAGTTTC CCGGATGCGA
13021 CTGATCGACA GAAAGCGCTG CTGGAGAGCC TGAGCGATAT CATTCTGCAT ATCCGCTATA
13081 CCATTCGTTC TTAATTAAAA CATTGTGATA GGCAGGCTCC TGAGGGAGCC TGTTTAAGGA
13141 GTTTTTATGC AGGGTTCAAC ACCTTTGAAA CTTGAAATAC CGTCATTGCC CTCTGGGGGC 
13201 GGATCACTAA AAGGAATGGG AGAAGCACTC AATGCCGTCG GAGCGGAAGG GGAGCGTCAT
13261 TTTCACTGCC CTTGCCGATC TCTGTCCGGC GTGGTCTGGT GCCGGTGCTA TCACTGAATT
13321 ACAGCAGTAC TGCTGGCAAT GGGTCATTCG GGATGGGGTG GCAATGTGGG GTTGGTTTTA
13381 TCAGCCTGCG TACCGCCAAG GGCGTTCCGC ACTATACGGG ACAAGATGAG TATCTCGGGC
13441 CGGATGGGGA AGTGTTGAGT ATTGTGCCGG ACAGCCAAGG GCAACCAGAG CAACGCACCG
13501 CAACCTCACT GTTGGGGACG GTTCTGACAC AGCCGCCTAC TGTTACCCGC TATCAGTCCC 
13561 GCGTGGCAGA AAAAATCGTT CGTTTAGAAC ACTGGCAGCC ACAGCAGAGA CGTGAGGAAG 
13621 AGACGTCTTT TTGGGTACTT TTTACTGCGG ATGGTTTAGT GCACCTATTC GGTAAGCATC 

13681 ATCATGCACG TATTGCTGAC CCGCAGGATG AAACCAGAAT TGCCCGCTGG CTGATGGAGG
13741 AAACCGTCAC GCATACCGGG GAACATATTT ACTATCACTA TCGGGCAGAA GACGATCTTG 
13801 ACTGTGATGA GCATGAACTT GCTCAGCATT CAGGTGTTAC GGCCCACCGT TATCCTGGCA 
13861 AGTCCACTAT GGCAATACTC AGCCGGAAAC CGCTTTTTTC GCGGTAAAAT CAGGTATCCC 
13921 TGTTGATAAT GACTGGTTGT TTCATCTGGT ATTTGATTAC GGTGAGCGCT TATCTTCGCT 
13981 GAACTCCGTA CCCGAATTCA ATGTGTCAGA AAACAATGTG TCTGAAAACA ATGTGTCTGA
14041 AAAATGGCGT TGTCGTCCGG ACAGTTTCTC CCGCTATGAA TATGGGTTTG AAATTCGAAC 
14101 CCGTCGCTTG TGTCGCCAAG TTCTGATGTT TCATCAGCTG AAAGCGCTGG CAGGGGAAAA 
14161 GGTTGCAGAA GAAACACCGG CGCTGGTTTC CCGTCTTATT CTGGATTATG ACCTGAACAA
14221 CAAGGTTTCC TTGCTGCAAA CGGCCCGCAG ACTGGCCCAT GAAACGGACG GTACGCCAGT
14281 GATGATGTCC CCGCTGGAAA TGGATTATCA ACGTGTTAAT CATGGCGTGA ATCTGAACTG
14341 GCAGTCCATG CCGCAGTTAG AAAAAATGAA CACGTTGCAG CCATACCAAT TGGTTGATTT
14401 ATATGGAGAA GGAATTTCCG GCGTTACTTT ATCAGGATAC TCAGAAAGCC TGGTGGTACC 
14461 GTGCTCCGGT ACGGGATATC ACTGCCGAAG GAACGAATGC GGTTACCTAT GAGGAGGCGA 
14521 AACCACTGCC ACATATTCCG GCACAACAGG AAAGCGCGAT GTTGTTGGAC ATCAATGGTG 
14581 ACGGGCGTCT GGATTGGGTG ATTACGGCAT CAGGGTTACG GGGCTACCAC ACCATGTCAC 

14641 CGGAAGGTGA ATGGACACCC TTTATTCCAT TATCCGCTGT GCCAATGGAA TATTTCCATC
14701 CGCAGGCAAA ACTGGCTGAT ATTGATGGGG CTGGGCTGCC TGACTTAGCG CTTATCGGGC
14761 CAAATAGTGT ACGTGTCTGG TCAAATAATC CGGCAGGATG GGATCGCGCT CAGGATGTTA
14821 TTCATTTGTC AAATAAGCCA CTGCCGGTTC CCGGCAAAAA TAAGCGTCAT CTTGTCGCAT 
14881 TCAGTGATAT GACAGGCTCC GGGCAATCAC ATCTGGTGGA AGTTACGGCA AATAGCGTGC 

14941 GCTACTGGCC GAACCTGGGG CATGGAAAAT TTGGTGAGCC TCTGATGATA ACAGGCTTCC
15001 AAATTACGGG GAAACGTTTA ACCCCCACAG ACTGTATATG GTAGACCTAA ATGGCTCAGG
15061 CACCACCCGA TTTTATTTAT GCCCGCAATA CTTACCTTGA ACTCTATGCC AATGAAAGCG

15121 GCAATCATTC TGCTGAACCT CAGCGTATTG ATCTGCCGGA TGGGGTACGT TTTGATGATA 

15181 CTTGTCGGTT ACAAATAGCG GATACACAAG GATTAGGGAC TGCCAGCATT ATTTTGACGA 

15241 TCCCCCATAT GAAGGTGCAG CACTGGCGAT TGGATATGAC CATATTCAAG CCTTGGCTGC
15301 TGAATGCCGT CAATAACAAT ATGGGAACAG AAACCACGCT GTATTATCGC AGCTCTGCCC
15361 AGTTCTGGCT GGATGAGAAA TTACAGGCTT CTGAATCCGG GATGACGGTG GTCAGCTACT
15421 TACCGTTCCC GGTGCATGTG TTGTGGCGCA CGGAAGTGCT GGATGAAATT TCCGGTAACC
15481 GATTGACCAG CCATTATCAT TACTCACATG GTGCCTGGGA TGGTCTGGAA CGGGAGTTTC
15541 GTGGTTTTGG GCGGGTGACG CAAACTGATA TTGATTCACG GGCGAGTGCG ACACAGGGGA
15601 CACATGCTGA ACCACCGGCA CCTTCGCGCA CGGTTAATTG GTACGGCACT GGCGTACGGG

156 61 AAGTCGATAT TCTTCTGCCC ACGGAATATT GGCAGGGGGA TCAACAGGCA TTTCCCCATT 
15721 TTACCCCACG CTTTACCCGT TATGACGAAA AATCCGGTGG TGATATGACG GTCACGCCGA 

157 81 GCGAACAGGA AGAATACTGG TTACATCGAG CCTTAAAAGG ACAACGTTTA CGCAGTGAGC 
15 841 TGTATGGGGA TGATGATTCT ATACTGG CCG GTACGCCTTA TTCAGTGGAT GAATCCCGCA 
15901 CCCAAGTACG TTTGTTAC CG GTGATGGTAT CGGACGTGCC TGCGGTACTG GTTTCGGTGG 
15961 CCGAATCCCG CCAATACCGA TATGAAGGGG TTGTTACCGA TTCCACAGTG CAGCCAAAAG 
16021 ATTGTCCTTA AATATGATGC GTTAGGATTT CCGCAGGACA ATCTTGAGAT TGCCTATTCG 
16081 AGACGTCCAC AGCCTGAGTT CTCGCCTTAT CCGGATACCC TGCCCGAAAC ACTTTTCACC
16141 AGCAGTTTCG ACGAACAGCA GATGTTCCTT CGTCTGACAC GCCAGCGTTT TTCTTATCAC
16201 CATCTGAATC ATGATGATAA TACGTGGATC ACAGGGCTTA TGGATACCTC ACGCAGTGAC
16261 GCACGTATTT ATCAAGCCGA TAAAGTGCCG GACGGTGGAT TTTCCCTTGA ATGGTTTTCT
16321 GCCACAGGTG CAGGAGCATT GTTGTTGCCT GATGCCGCAG CCGATTATCT GGGACATCAG 
16381 CGTGTAGCAT ATACCGGTCC AGAAGAGCAA CCCGCTATTC CTCCGCTGGT GGCATACATT
16441 GAAACCGCAG AGTTTGATGA ACGATCGTTG GCGGCTTTTG AGGAGGTGAT GGATGAGCAG
16501 GAGCTGACAA AACAGCTGAA TGATGCGGGC TGGAATACGG CAAAAGTGCC GTTCAGTGAA
16561 AAGACAGATT TCCATGTCTG GGTGGGACAA AAGGAATTTA CAGAATATGC CGGTGCAGAC 
16621 GGATTCTATC GGCCATTGGT GCAACGGGAA ACCAAGCTTA CAGGTCAAAC GACAGTGACG
16681 TGGGATAGCC ATTACTGTGT TATCACCGCA ACAGAGGATG CGGCTGGCCT GCGTATGCAA
16741 GCGCATTACG ATTATCGATT TATGGTTGCG GATAACACCA CAGATATCAA TGATAACTAT
16801 CACACCGTGA CGTTTGATGC ACTGGGGACG G7AACCAGCT TCCGTTTCTG GGGGACTGAA
16861 AACGGTGAAA AACAAGGATA TACCCCTGCG GAAAATGAAA CTGTCCCCTT TATTGTCCCC
16921 ACAACGGTGG ATGATGCTCT GGCATTGAAA CCCGGCA7AC CTGTTGCAGG GCTGATGGTT
16981 TATGCCCCTC TGAGCTGGAT GGTTCAGGCC AGCTTTTCTA ATGATGGGGA GCTTTATGGA
17041 GAGCTGAAAC CGGCTGGGAT CATCACTGAA GA7GG77ATC TCCTGTCGCT TGCTTTTCGC
17101 CGCTGGCATC AAAATAACCC TGCCGCTGCC A7GCCAAAGC AAGTCAATTC ACAGAACCCA 
17161 CCCCATGTAC TGAGTGTGAT CACCGACCGC 7ATGATGCCG ATCCGGAACA ACAATTACGT
17221 CAAACGTTTA CGTTTAGTGA TGGTTTTGGG CGAAACCTTA CAAACAGCCG TACGCCATGA
17281 AAGTGGTGAA GCCTGGGTAC CTGATGAGTA TGGAGCCAAT GTGGCTGAAA ATCAAGGCGC 
17341 CCCTGAAACG GGCGATTACA AATTTCCCCT TGGGCAATTT CCCGGACGTA CAGAATATTA 
17401 ACGGGAAAAG GCAAAGCCCC TGCGTTACGT 7TCAAACCGT ATTCCTGAAA TAATTTGGGC
17461 AACTATGTCA AGTTGACCAA AAAATGCCCG GCAGGATATG TATGCCGATA CCCATTACTA
17521 TGATCCGTTG GGGCGTGAAT ATCAGGTTAT CACGCCAAAG GCGGGTTGCG TCGATCCTTA
17581 TTCACTCCCT GGTTTGTGGT GAATGAAGTT GAAAATGACA CTCCCGGTGA ATGACAGCAT
17641 AAAGCTCAGT GATGCCTGTT CACTGAACAG ACATCACTCC ATTTAGGAAT GAATCATGAA
17701 GAATTTCGTT CACAGCAATA CGCCATCCGT CACCGTACTG GACAACCGTG GTCAGACAGT
17761 ACGCGAAATA GCCTGGTATC GGCACCCCGA TACACCTCAG GTAACCGATG AACGCATCAC

17821 CGGTTATCAA TATGATGCTC AAGGATCTCT GACTCAGAGT ATTGATCCGC GATTTTATGA
17881 ACGCCAGCAG ACAGCGAGTG ACAAGAACGC CATTACACCC AATCTTATTC TCTTGTCATC
17941 ACTCAGTAAG AAGGCATTGC GTACGCAAAG TGTGGATGCC GGAACCCGTG TCGCCCTGCA
18001 TGATGTTGCC GGGCGTCCCG TTTTAGCTGT CAGCGCCAAT GGCGTTAGCC GAACGTTTCA
18061 GTATGAAAGT GATAACCTTC CGGGACGATT GCTAACGATT ACCGAGCAGG TAAAAGGAGA
18121 GAACGCCTGT ATCACGGAGC GATTGATTTG GTCAGGAAAT ACGCCGGCAG AAAAAGGCAA
18181 TAATTTGGCC GGCCAGTGCG TGGTCCATTA TGATCCCACC GGAATGAATC AAACCAACAG
18241 CATATTGTTA ACCAGCATAC CCTTGTCCAT CACACAGCAA TTAGTGAAAG ATGACAGCGA
18301 AGCCGATTGG CACGGTATGG ATGAATTTGG CTGGAAAAAC GCGCTGGCGC CGGAAAGCTT
18361 CACTTCTGTC AGCACAACGG ATGCTACCGG CACGGTATTA ACGAGTACAG ATGCTGCCGG
18421 AAACAAGCAA CGTATCGCCT ATGATGTGGC CGGTCTGCTT CAAGGCAGTT GGTTGGCGCT
18481 GAAGGGGAAA CAAGAACAAG TTATCGTGAA ATCCCTGACC TATTCGGCTG CCAGCCAGAA 
18541 GCTACGGGAG GAACATGGTA ACGGGATAGT GACTACATAT ACCTATGAAC CCGAGACGCA 
18601 ACGAGTTATT GGCATAAAAA CAGAACGTCC TTCCGGTCAT GCCGCTGGGG AGAAAATTTT 
18661 ACAAAACCTG CGTTATGAAT ATGATCCTGT CGGAAATGTG CTGAAATCAA CTAATGATGC
18721 TGAAATTACC CGCTTTTGGC GCAACCAGAA AATTGTACCG GAAAATACTT ACACCTATGA
18781 CAGCCTGTAC CAGCTGGTTT CCGTCACTGG GCGTGAAATG GCGAATATTG GCCGACAAAA
18841 AAACCAGTTA CCCATCCCCG CTCTGATTGA TAACAATACT TATACGAATT ACTCTCGCAC
18901 TTACGACTAT GATCGTGGGG GAATCTGACC AGAATCGCAT AATTCACGAT CACCGGTAAT
18961 AACTATACAA CGAACATGAC CGTTTCAGAT CACAGCAACC GGGCTGTACT GGAAGAGCTG
19021 GCGCAAGATC CCACTCAGGT GGATATGTTG TTCACCCCCG GCGGGCATCA GACCCGGCTT
19081 GTTCCCGGTC AGGATCTTTT CTGGACACCC CGTGACGAAT TGCAACAAGT GATATTGGTC
19141 AATAGGGAAA ATACGACGCC TGATCAGGAA TTCTACCGTT ATGATGCAGA CAGTCAGCGT
19201 GTCATTAAGA CTCATATTCA GAAGACAGGT AACAGTGAGC AAATACAGCG AACATTATAT
19261 TTGCCAGAGC TGGAATGGCG CACGACATAT AGCGGCAATA CATTAAAAGA GTTTTTGCAG
19321 GTCATCACTG TCGGTGAAGC GGGTCAGGCA CAAGTGCGGG TGCTGCATTG GGAAACAGGC
19381 AAACCGGCGG ATATCAGCAA TGATCAGCTG CGCTACAGTT ATGGCAACCT GATTGGCAGT
19441 AGCGGGCTGG AATTGGGACA GTGACGGGCA GATCATTAGT CAGGAAGAAT ATTACCCCTA
19501 TGGGGGAACC GCCGTGTGGG CACCCGAAAT CAGTCAGAAG CTGATTACAC AAGCCGGCGT
19561 TATTCTGGCA AAGAGCGGGA TGCAACAGGG TTGTATTACT ACGGCTATCG TTATTATCAA
19621 TCGTGGACAG GGCGATGGTT GAGTGTAGAT CCTGCCGGTG AGGCCGATGG TCTCAATTTG
19681 TTCCGAATGT GCAGGAATAA CCCCATCGTT TTTTCTGATT CTGATGGTCG TTTCCCCGGT
19741 CAGGGTGTCC TTGCCTGGAT AGGGAAAAAA GCGTATCGAA AGGCAGTCAA CATCACGACA
19801 GAACACCTGC TTGAACAAGG CGCTTCCTTT GATACGTTCT TGAAATTAAA CCGAGGATTG
19861 CGAACGTTTG TTTTGGGTGT GGGGGTACAA GTCTGGGGGT GAAGCGGCCA CGATTGCAGG
19921 AGCGTCGCCT TGGGGGATCG TCGGGGCTGC CATTGGTGGT TTTGTCTCCG GGGCGGTGAT
19981 GGGGTTTTTC GCGAACAACA TCTCAGAAAA AATTGGGGAA GTTTTAAGTT ATCTGACGCG
20041 TAAACGTTCT GCTCCTGTTC AGGTAGGCGC TTTTGTTGTC ACATCGCTTG TGACGTCTGC
20101 ACTATTTAAC AGCTCTTCGA CAGGTACCGC CATTTCCGCA GCAACAGCGG TCACCGTTGG
20161 AGGATTAATG GCTTTAGCCG GAGAACATAA CACGGGCATG GCTATCAGTA TTGCCACACC
20221 CGCCGGACAA AGTACGCTGG ATACGCTCAG GCCCGGTAAT GTCAGCGCGC CAGAGCGGTT
20281 AGGGCACTAT CAGGCGCAAT TATTGGCGGC ATATTACTTG GCCGCCATCA GGGAAGTTCT
20341 GAGCTGGGTG AACGGGCAGC GATTGGTGCT ATGTATGGTG CTCGATGGGG AAGGATCATT
20401 GGTAATCTAT GGGATGGCCC TTATCGGTTT ATCGGCAGGT TACTGCTCAG AAGAGGCATT
20461 AGCTCTGCCA TTTCCCACGC TGTCAGTTCC AGGAGCTGGT TTGGCCGAAT GATAGGAGAA
20521 AGTGTCGGGA GAAATATTTC TGAAGTATTA TTACCTTATA GCCGTACACC CGGTGAATGG
20581 GTTGGTGCAG CCATTGGCGG GACAGCCGCG GCCGCTCATC ATGCCGTTGG AGGGGAAGTT
20641 GCCAATGCCG CTAGCCGGGT TACCTGGAGC GGCTTTAAGC GGGCTTTTAA TAACTTCTTC
20701 TTTAACGCCT CTGCACGTCA TAATGAATCC GAAGCATAAC AATCATGTTC ATTCCCACTT
20761 TGTCATGGAT GACAAGGTGG GTTTTTCGGA TGTGTGGACA GAGACCCGTA CAGGGTCTCT
20821 GTCCAGTTAA TTTTTGGATC AAGAACGAAT GGTGTAACGG ATATGCAAAA TGATATCGCT
20881 CAGGCTGAGC AATAAGCTTT TCTGTTTACC ACTGATACCG GGAAAACTGA GGGTTAATGT
20941 GCCTGTATCG GCCACAGGAA GCCCTTCAAA CAGGTAC TTAGCATCAT TGAAATCCAT
21001 CTGGAATTGA CCACTGTCAT TCATGCCATG TGAGATCACA ATCGCTTTGC AGCCACGTGG
21061 CA7CATTGTA CTGCCGCCAT AACTCAGTAT TGCCCGGACA TCCTGATAAG GCCCTAAAAG
21121 GGCAGGTAAC GTCACACTGA TTTGTTTGAT ACGGCGTGTA TTACCTAAAC CGTCAGGATA
21181 ATCGGTAGCA ATATTCAGAT CCGATAATTT GAGGCTGGCT TGCAGTTGTG TCCCTTCGAC
21241 GTTCAAACCG TTAAGCGTTG TGCCTGCACT GCCTTCACCT GCATTGACTA ACTCAGTCAC
21301 TTTATCTTTT AAAATGAAAC TATTTTCTGT CAGACCAGCA TACACTTCAG CCAGAGAAAC
21361 GGTTCTGGTG ACCTCCAGTG CCCGTTCATC TTTTTCCAAA TAGCTTTTTT CCATCTGTGC
21421 TAAATTCAGC ATCAGGGTTT CACCCGCTAA TAAACCCGCA TAAGTCCCAT GCCAAGCACC
21481 TGGTTTAATA AAGTGTGCTG CCGCATTATT CAATTCATAC TGATAAGTTT GCTCTGCCAT
21541 TAAACAGAGT GAGACCGCCA AATCATAAAA CTGATAATAA ATAGCGGACA ACGTTCCACG
21601 GAGCCAGTTG TATAGCGCTG CATTACTGAA TTTACTTTGC AGAAAGGCTA ACTGCGCCTG
21661 AGTTTGTGCC TGCTGAGTTT CCAGATAGTT TTTTTGTAAT ACTGCCGCTT CACGACGTAC
21721 AGCCAGCGTC GCTAATTGAG CATCAATTTG TTTTATCTCA GCTTCCGCAT TATTGCGCTG
21781 AATTTCCCAC TCTTGCCGAC GGCGACGGTA TATTTCTGAT TGGCTGATTT TGTCTGCGGC
21841 AATACGTGTT GCTGACGCAG AAATTTCGAT ACCAATCGCA CTGGCATTGA AAAGCGCCCC
21901 AAAACGGGAA CCTCCCACAG CAAAACCGTA AATATTGGGG ACGAGATCTG CCGCGGCGGC
21961 GGCCATATGC AGGGCTGTGC CGCTGGTGCT CAAGACCGAT GAAGAGAGGT AAAGATCCAT
22021 CGCTTGTTTT TCACCAGCGT TAACATCTTC GTCGTACAGC GTATTGAAAC TGTCAAAACG
22081 AGACTGTGCA CCATGACGGC TTTCTTGAAG CGCCAATTTA TCAGCATCAA TTTCAGCCAT
22141 GACCTTATCC TGCATTTTAA TACTTTGCAG GGCTAACTCA CTGCCTTGAG TTTGCAGTAT
22201 TTCAGCCAAG GCTTCTGCAT CCTGCCGTTC AGTAATGCTG AGCAGGGTAT TGCCAAATTG
22261 TATCAACTGG CTTACCCCCC ACTTGGCATT TTCCAGAATC ACCGGAAAAC GGTACATCGG
22321 CATCACTGCA TGAGGTAAAT CGCCGCCGCC TTGTGAAGCA GTGATGGCAG CACTGAGTAA
22381 CATGGACGGA TCTGCGGGCG TGGCATAGAG AGATAATGAC AGTGGCTGAC CGTCGATTGT
22441 CAGGTTATGG CGTAAGTTAT AGAGGCGTTG CGTCAATGTC TGCCAGTAAC CTTGCAGTTT
22501 TTTATTAATT TGAGGGAGGA ACAATGCGGT TAACGAAATT TGCCGTACGT TTCGTGGGTA
22561 ATGCAGCGCG CTGACGCAGT TGCAGCATTT TATGTTGATA ATGATGCCGC ATTGTTTGGC
22621 TGGCAGCTTC TTCCAGCCGT GGCTCTGACC AATCGTTATC CAATGAAAAA TAAGGCTCAT
22681 CACCCAATAA AGTGAGCGCC TGTACATACC ACATTTTAGC TTCGTTTAAG GTATCACGTT
Fig.2.

22741 CAAGCTGGCG ATAGGCGCTA TCTCCGCGGG TAATCAACAA ATCCAGCATT TTCATAAAGG
22801 TAGCCACTTT ATAGTGCATC GGATCATGCT GGGCAACGGC GTCCGGATCG ACCGAATCCA
22861 GCGGATTGGC ATTCCAGGAC GTATCTTCCT CCAATGGGCG GACGTTCCAG TAATAATCCT
22921 GCATTTCACC CTGAACCGAA TATCCGGTCG GGTTCAGATA TAGCGCAGCC AGCGTGTCGA
22981 TCCGGTAAAA TCTGCTCTTG CAATAAGCGC TGGAATACCA TCATGGGCGT TGTAATAGAA
23041 CAATCCCAAG AAATAGATTG CATTGGCGCC GTTTGAAATC CATGGGTTCA GTGTTATTTT
23101 TCATGACACG ACTTGAATAC CCCTTTTATA TTTTTTGATA TTTTTTACTA TCCCCTGTTG
23161 TGTCATTCCC GAATCATGAT CGGCATCATT AGTGAATATA AATTGATTTT TCGTCTCATC
23221 AAAATAAAAG AAAGCAGATT CCCAGGATTT GTCATAGATA ATTTTTTTGT ACCCAACCCC
23281 TAATCTGACA CCTTCACGTA TGTAATATCC TTTAGCATAG GGAACAAAGA GCGTTACTGT
23341 GGTTTCAATA TCAGATAACA TTCCTTCGTA ATAAGGTTGT CTGGCAGAAT TGCCATCAAT
23401 ATTCCCAATA TGGATCTTAA ACCAACGTTC ATCACCATGC TCCTCTTTAT TGTAGGGGGG
23461 CAACTTAAAT GTCGCATAAA ACCCTTCACC TAATTGCGGC TCTGGTAAAT TTTGCGTTTC
23521 CATACTTAAA ACATTATCAA TACCAATATT GGCTCTTTCA GCTAATTTTC TGGAAAATAA
23581 AGTATTTAAC CGGGTTCTGT AAGGGCCAAT CTGCATATAT TGTGTGCCTG ATGGCATTTT
23641 ATGCAGTGAT ATAACGTTAC TTGTATCTTT GGATTTTAGT TTTATATGAA TTGGCGATTC
23701 AATAACAATA TCGTTATAAC CGCCGTCGGG TTGCTTAATA ATAAACTCGC TCACCAGAGG
23761 AATATCATAG CCTTCAATAT CAACTTTTAC TTGATTAAAA TCATATACCA TAGGGTCAGA
23821 TTCGTGTGAA GGTTTAGATG CCACATGGTC TTCAGCATTT AACTCCACTA GAATATCAGA
23881 GCCATTTTTT AATAAAAAAC TAATGTTTTT ATCTTGGATC TGTTCGATCA TAGATGAAGC
23941 AAGTTTTATT ATCTGTGGCT GGTTGAACAT AAATACACCC ATGGATCCTC GCGAAGGAAC
24001 AGTGCCGCAA TATTTCCCAT GTTATTAATG ATTGAAACAT CATTAGTAAA TGATTCACAT
24061 ATAGTATGCC ATACTCCTGT GTTATCTTTC CAATCTAATA CTATGTTAGT ATCAAGTTTG
24121 AATTCAGCAT CATCTGATTC ATAATCATAA TTTATACCAA CTCCAATTTC TGATTTTCTA
24181 GGAATTTTTT CCTTGGTTCT TAGATGCATT AACACTCTAA AATATTCGGC ATTTTTAAGA
24241 TCGATGGAAA TAATAAAATC CAAAGTTCCA TAATGAAAAA CTTCTTCTTC TTTTCCAAGC
24301 ATTTCATCAT GTCTATCATA ATCAAATAAA ATAACCGTTT CATCTTCTAC CATCGATAAC
24361 AGGTATTTAA CCTCATCATT ATATATATTG CCTTTTGAAA AATTAATTTC CATTGAAGGA
24421 TTGAACGTTA AATTAATATG ACCATTTCCT GGTGATATAT ACGAGAGATC AAAAATATTT
24481 CCGGTAAAAC TGGCTAATTT ATTTTTTGTG GTTATAGATT CCTTATATTC GGCCAAATAA
24541 TCTGTAGCAA ATTGATTGTT GACTTTGTAT TCTGTCCTGG TATCAAGTTC TGATAATGTG
24601 CTCTTAACAA TGGCGTCTAA ATCATTTTCT GTGAGAATGG ATAATGTCAT ATCAGGGTTA
24661 ATGGTCATCC CTTCTCTTGC AGGAAGACTA TTAAAAGAAT AATTGTCTTT TTTCTCATGG
24721 AAATAAACAA TAATGACGTC TTTTTCATAA TCAGAAGAAC AATACATACC AATGCTGGCT
24781 TTTTTATTGA TCAGGTTTTC TATTTTATCA GTCACATTAA AATTAAACGG TGAGCTCCAG
24841 CTGCCATCAT AACGAATATG TGACAGTTTT AATATATAAT CAGTGATATC TATCTTGCCA
24901 TCTTGACTTT CATTTTTCAG CTCTTTTTGT TCCAGCCACA GTAAATACAA ACGAGACTTG
24961 TAAATAACAG GTCTGATATT TTCCTGCCAT ACATTGATGG GTATTTCAAT TTTTTTCCAT
25021 TCTCCCCAGG CATTGGCAGC AAATTGACCG TGCTGGCACT TTTGGTGATC GACATTGCGC
25081 CAATAATATA TTCTGGGTTC TGTCTGGCTA TAACCAATTA AATAAGTGAG CCCCTCATTG
25141 ACATTAATAC TGTCATGATA TCCGCTAATC ACCTGCAAGT TAGCGACATC TTCAAATGCG
25201 GTCAGATAAT TTTTAAAGCT ATCTTCAACG GTATCGATAT TTAACTGACT TTGGGAAAGT
25261 TGCTGTAACA GGTTGTTCAT CATACCTGTC TGACCAATAC GAATCGTGGG GTCGATATAG
25321 TTTTCCGGAT AATAGGCCAG TTCAGATACG CCGGCCCAGG TGCTATACCG TCGATTGTAG
25381 GTTTCCCAGT CGCAGAAGAA CTGACGGGTT TTCACTGGCT TTGATACTTT TCCTTCAACA
25441 TTATTCAACG CCCGGTTGAC ATATAACTGA ATGCTGGCAA TGGCTTCTGC CACACGGGTG
25501 GTTTTCACTT GGGCAGAAAC TTGGTTATCA ATCAGCAGAT AGCTGTACAA CTCATCCCGG
25561 CTCTTAATCT GTTGAGGTGC ACCATTTTTG ATGTAGTAAG CACTGGCCGC TGTCGTCGTG
25621 GCTTCATCCA GCCATGCCTG AAGCTGGTCG GATTGTTGAC TGTTCAGTCC CGCCTGCAAC
25681 AAAGTACTGG CGGCTTGCCA ATCATCAAAT GTTGGCATCG GGGTTTCCGG TTCACCGACA
25741 TATTTTAATT TTATGAGTGC AGCAACACCA TCCGGGGTAA TACCCAATGT AGCAGCGACA
25801 TCCAGCCATT GCAGAGTGAC ATCTATAAGT TCTCCAGTTG GTAAAGGTAT TCACTCCCAA
25861 ACCGGTCTGT TGCAATGCTT GTGTCACAAC CTGAGCATCA AAATTTTAAC GCCACCGCCA
25921 AATTGTTCGG CAGTCAACGC TCCTAAGTTC CAAATGCTGT TAAGATTCTG TCGCGTAGCT
25981 TCACAACGCA TGATCACAGC ATGGAAGCGG GTCAGCGCTT GCAAAGTGGG GAGATCATGT
26041 TGCAGTGCTG TGGTTTCTGA TTGGAATTTC TCCGGTTTTG TCACCAACAG GGTCAGTTCG
26101 TTTTCGCTGA GTCCAATATT GCGCACAATC AGAGAAAGTT GCCCCAGTAC CTGACAAAAA
26161 GCCACCATGT TGCTGGTTTC ATTCTCTGAG CGATCACGGT TAGCCGCAAT AATCATGAAA
26221 TCATCGAATG TCAGTCCTTG TGGTTTTATC TGATTAATCC ACAGCAAAAT AGTTTCTGCf
26281 GTTTTGGCTG AATCCATTTG AATGCTGGCA GCAATCAGCG GGGCAGCTGC ACGGATCAGT
26341 TCGTCATCAC CGAGTGAAAG TGTTGATAAT CCATTACTTA GTGTCGTGAT AAGGTTTTCA
26401 ATATCCGGCG TAAGGACAGT GCTGTAATTA TCCGTGGTCA TCAGAAACAC ATCACTGACA
26461 GACCATTTCT GTGTTGTCAG CCACTGGGTG CATTGGAACA GAAAGCTGAT TAATTGCGTT
26521 AATGCTGTAT CAGAAAAAAG GGCAATTTTC GTGTTCACAT AGGGAGAAAC CGACAACAAC
26581 ATGGATAATT CATTCACTGT CAGATGATGA ATGTCTGCCA GCAGACGAAC GCGATAAAGC
26641 AGAGACAGGT TCTCGATGGA ACACATAAAT TCTGGATTTG TTCCGCCATT AGCCAGTTTC
26701 CATAATGTAT ACAGTTCAGT ATCATTCACT CTGAAAGCAC GTTTCATTAT TCCCAAATAA
26761 AAATGGTTTT TTGATTCACC GGGGGTTAAA TCCAGTTTGG TATTATCAGC AGAAAACTCT
26821 TGGCCATTTA ATAGCGGTGT ATTGAACAGC ATTGTAAAAT GACTGGGTTG TTGTTTAGTG
26881 GAATATTGGC TGATATCTGA ATGACACAAT ACCAGCGCAT CGCTGACGCT AATATTATAG
26941 TGCTGCATAT AATATTGAAC ATAAAACAGC TTACCCAACA CATTGCTGTC AATGGTTAAG
27001 TCATCATAAA TACTTTCTAT TACTTGCCAG ATATCTTCTG GAGATATGCC TGTGGCTTTA
27061 TACAAACGAA TCGCTTTATT CAGCTTTAAC AGGAATATAT CACCGGGAAC TCCATCATTT
27121 TAAAGTGTGC ATTGGCATTG ATAGCATCCG ACGGATTTGG TTAACTCGCC ATAAGCGGAG
27181 TGTTATACCG TTGGTGATTT GCTCTGTCGT CAATTTAATG GGAATACTGT AATGGGTATT
27241 AGCAATGGGG ACGAAATTTT TATCTTGGTA TATATATTCT TTATCTCCAT TCTGGAGACG
27301 AAAATCCAAG TGGTCAGGTT CTGTTTTTTT TACACTGAAA TTATATTTGT ATTCATTTTC
27361 TTTGATTGGA ATTAGCTCTG CATAGTTTAA ATGTGAATCG TAGAAATCTT TGCGGGTTCG
27421 CTTAATCAAT CTTGCCGTTG CCGTATCATT CCCGTCATTG ACCAATGTTA TCAGTTGCTC
27481 ATTCTTATAC TGTTGATTTG TATTTTTCTT ACCGAAGGAG AGATTGACAA ATAAACTGAG
27541 TTCATCATAA GACAAATCGT AGTAGCGAGC CAAAGAAGCA TAACTCTTAA AAATCAGTAC
27601 ATCATCTGTA CCGAAATTTT TCTTCATCAG TTCTGTTGAA TTTTCCGGTG TAATTTCTTC
27661 TACAAGGATT TGATACAATT CAGGCGATAT ATCAGTCTTA ATAGCCAGTA GCGATGTTGG
27721 GTCCATTAAT TCCGCTACGT CTGTATTACG GCTAAATGCG GTGAGGTTTT TATCTTGCAA
27781 TAAAATTGCC TGACGGGCTG ACTCATACGG CAGATGATAG GGTGTCATGC CGGTTTGCCG
27841 GTAAGTGGAC AACATTTTCA TTACACCGTT ATAGTCAGTT TTCTCTAACG TCTGAATATT
27901 ATGCAGCAGT AATTCATTAG ATAAGGATAA TGTGGAAATT TCTTCATCCA TATTATTCTG
27961 TGTCAGTGCC AGTGAAGCAA TGTCGGGGCG TCGTTTATTC AGGTGATATT GAGAATTGTC
28021 AGGATGAAAA TCTTTCGCTT CCCGATATAA TTCTGTTAAA TAAGCCGCTG GTGAAAATAT
28081 GGAAGCAATT GATCCCGGTT TTACAAAACG GTGGGCGCGG CCATAAAACC AACTGTTGTA
28141 ACTATTGTTT AGGGTTGACG GTGTAATATT AAGGTTAGTG ATATTAGCCA GTTGTGGATT
28201 AGCACGGGAC AAAATGCGCA GTTCTTCAAG TTTATTCTGT TTTGATTCCT GATGAGCCTG
28261 TTGATATAAA AAGTCTGTTT CTCGCCACGT CAGAGTTCCA CTTGTCCTAT GACGAAATTC
28321 GCTGAAAGAC ATAAACGAAA TGTTTGTCAA TAATAAAGTA TCACCAGCCT TTTTCTATTT
28381 ATCTTATCTA ACAGTTCATT AACTTTTATC ATATAAATCC TTAAGTTATT GTCAATTTAA
28441 TGATTAATGG TTTTTAGGTG GAGATTATTA TAATCTGATA GGAATATTAT GGTTAATTAA
28501 ATTGATACTG ATTTATCGCT CTATTCTTTC AATAAAAAAT AAAGAACTTC CCTATAATAC
28561 ATGGATTTAA ATAATGAATA CCGTATGTTA AAAATTAAAT TTTAACAAAC TTTCATGAAA
28621 AAATTCAACT CAACAATTGT TTAAATATTT TTAATTGTGT TTGTGCTGTT TGAAAAATGA
28681 ATGACTAATA TTTATCTATG AAAGATTATT TATTGAGGAT GTCTTGCTTG GTTTCAGGGG
28741 GCTACGTTGG AGTCAGATAA ATGTGTGCAA AAAGAAATCC TTAATAAAGT TGCGTAATTA
28801 CAAAAGTTGG TATATCGTGA CAAGAGTGAT AGTAATGTCA CATAATTTAT TGAATACCCG
28861 AACCTCGCAA ATGCGGGGTT TTTCTTCGCA TAATCAAAGA GAAAGCTATG AAAAAAACAC
28921 TGATTACTCT TATTCTCAGT ACCCTTTCTT TTGGTGCTTT GGCACAGCAG GGTGGCTTCG
28981 TTTCCCCGGA CAGCACAGAC TATACTCAGG GTGGATTTAA AGGTCCAACT CCCAACCTGA
29041 CCAGCGTTGC TCAAGCAAAA TCTTTTCGTG ATGATGCGTG GGTTGTTCTG GAAGGAAACA
29101 TTGTTAAACA GGTTGGTCAC GAACTCTATG AATTCGCGGC CGCATAATAC GACTCACTAT
29161 AGGGATCGCT TATTACGGAC TTATCCGGAA AGCTATCTGG AACCCCTGTT ACGCCTGAAT
29221 AAAACAGAAT TCAGGGATAA CAGTGGTTCT GTTTATGTTG ACATTGATGA TAAGCGCTGG
29281 ATGGGTCTGA CGGCCACTCC AACTGACAAA GTTCGTATCG AAGGTGAAGT GGACAAAGAC
29341 TGGAACAGTG TTGAAATTGA TGTCAAAACT ATCCGCATAG TGAAATAACT CAAGCACTTT
29401 GAATATAGCC CCGCACTCGC GGGGTTTTTT GCTTTCTGGG AGTCGGAAGT TTAACCGTAG
29461 TGACGAGGAT CAAAACTAAG TTAACGGCAG TGGTCACTGA TTTGGTGCAT AAGTTATCAA
29521 AAGTTAAAAA TCAAAACTTA TTTTTTATTT AATAGAGGAA TGTCACCCTG TAGGTGAATA
29581 ACGTTGACGG ATGTAAATAT ACAGTATTAT AGTCCTTTGA TATGTTATTA AATTGAAAAA
29641 CCTTTAAACT ATATTCGGGG GAAATTATTA TGTCAGATGT TCGTAATATT ATTAATGTTG
29701 ATAACAATTT TGGTTGTGAA TATAAAGCGG ATTTATTTAA ATAAGTTTTC ATAATTGTGA
29761 TACACCCATT TTTCTCATCC CCGGTTTTTG CTGTTGTAAG GAAGCGGTTT CCATGAAGAT
29821 TTTGACATGG TTAAGCAACT GCCACATAAA TTGGCAGCAG TGGTTTCGTG TCACGGTTTC
29881 ATGCAAGGAT TGCCATAGAC GTTCAATTTT ATTCAACCAC GGGCAATAGG TCGGTAAAAA
29941 GAGAAGATTA AATTTGGGAT TCTTTGCCAG CCAAACCCTG ACCTTCCGGC TCTTATGAAT
30001 GCAATAGTTA TCTAAAATTA ACGTGATGGT TTTGGCATTA ACATATTGAT TGTTAATTTC
30061 ATCTAACAAT TTGATAAATA AATCTGAGTT CTTTCTCAAG CTACCGACAT AAGTGATTTC
30121 TTTCGTTTTC GCGTTGAGGC AATTGGCAAG GTAGTGTTTT TGGTTCTTTC CGGGGGTAAC
30181 AACACGCTTT TGTTGCCCTT TGAAGCACCA GTCTGCACCG ATTTTCGGGT TCAGGTTGAT
30241 GTCCACCTCA TCCTCATAGA AGACCGGGTG TTTCTCTTGA GGCATTGGAT AACGTCTCGC
30301 TGATTTTTGC CATTTTTTCA TCATACTCAG GGTCAGGCAA TTTTACGGTT GGTGCCGCCC
30361 TTC3CCAAAC GATGCCCGTC CGGCAAAAGT AGCGATAGAG GGTACTTTGA GAGAGCGATG
TATTCAGTAG
AG CCAAAATG 
AGTTCCAGAT 
TTAACCAATT 
GAGAAACCGT 
TATCCCGGGT 
TTATGATTGG 
CGCGAAAATC 
GGAGTCAGCA 
TGATTCATCT 
TGG CCATTAT 
GTTTCAACAT 
AATTATGTCT 
CGGGAA CCTT 
AATTAAGTTT 
TTTATGGTTT 
TTATATAGTA 
TAACAGAAAA 
GAG CTGTATT 
GTCAGCAGAT 
AGTAAAGGCG 
AGCCTCACCA 
TTTTATCCCT 
GGAACCGG CA 
GTCGAGACAA 
GATCCGTTAA 
AAAAAAGGAC 
AGGCCTTACC 
CACCTGTATA 
CTGACACATA 
ATGCTGGAGT 
CAGCGCCATG 
TTTGTCCATC 
GAACAGCTTG 
GCTGAAGGAC 
AGTCTGGACA 
CATTAAATGG 
ATTTTATTTA 
TATATAACGG 
CGTG AGCA GC 
CCAATTTCAT 
AAATCCAGTG 
CCAGTTCAAA 
GCAGACAGAA 
TATAGCGGCG 
GGTGAATTTT 
AGTAATCCAT 
TGGGGGGATA 
CGTAGGTCAT 
CCATTTTGGC 
AGGCGATGTT 
GTTCCGGGTC 
CCACCGCAGT 
GCACGTTGCT 
ATACTGAATG 
AATCCCCCCA 
CTGG CATCAC 
TAAGGATCAA 
TCTGCTTGCC 
TCACCGTTGA 
TCCAGATCAA 
TCTGGCG AC A 
AGCGGTTCTG 
GCTGAAGAAT 



CTCATTGATT 

TTGTGGCGAG 

GGCAGGCCTT 

TA CCCAACG A 

ACTCCCTTCA 

TTTCTGGATA 

CATGACTCAG 

GGACTGAGTT 

AATGAGTTAT 

ACCGGTGGTA 

ATCAAAGTTA 

GGCAGTTATG 

GGTGATTCAG 

CCACTGATGG 

ATATTTCATC 

TATATTTAAT 

AATAAATTCT 

TTCATGGTTA 

TACTGTAGAA 

AATATGTTGT 

ATTAATAACC 

TGACGCGTTA 

TTATCTGCCG 

GCTTTGTGGA 

CCCACGGGGA 

TGG CCTGG CG 

ATACTGAACT 

CTTACTCAAA 

ATCAGCCCCT 

AAAGCATTGC 

GGGTTCCCCA 

TTGTG TTAAG 

AACTGACTGA 

AACAAAAAGG 

GGGAAGAAGG 

TCATTGTCAC 

ATACG CTTTT 

CTACGATTTA 

TCCCATATCA 

ATTTGCCAAC 

TTTGGTTGCA 

ACCACCGTCA 

CTGATTTTTC 

GAATTCCGGA 

CGTTGTCAGA 

CTCCGGTTGT 

GTCCCGATCA 

CAGGTTAGTA 

CACAAAGATA 

AACGACGGCG 

CAGTG CTTCA 

GAATTCATTA 

AAACAT GGGA 

CAGGATCTTT 

CGAGTTCCAG 

GTAAACCGGA 

GGCGCTGATC 

CGGGTACAAT 

GGTTCCACCC 

CGGCCATAAA 

AACCACGGCC 

AACGCGCATC 

TATTTTCCGG 

AACTCAAAGG 
TTAAGTGTAA

TG CTGTAAT A 

CCCGCCGGGA 

TGAACGGAAG 

TGTAACATCA 

GCTTTTTTCA 

TCCATTTTGG 

CCCTTCAAGT 

TCCCCATAAT 

TGTGGATTCC 

CTTTCAGTAA 

TTTATTTTAC 

CTAAAGGCAA 

TATTGAATAA 

TGGTTTCTGC 

GCCAATCATA 

GTTGGATGTG 

GGAAATTCAA 

CTCG CATTG A 

ATATTGG CTG 

GATAAAACAG 

TTCAAACATT 

GAAGCGATCC 

CAGGCAATTA 

CGGTTACATT 

G CTGATGTAT 

CCCTTTGGTC 

TCGATGGCTG 

GCCGTTGGTG 

CTTGATGGAG 

ATTGGTGGCG 

CTATATTTTA 

ACAATCTCCG 

GCGTGAG CAA 

CAAGCTGGAA 

CAGTACCGGC 

TCACAGCAGG 

CGACGGGTTA 

ATCTTCTCTT 

AGGCCATCAT 

TAAATTCCCT 

G CATTAAAG A 

CCGCGTG CAA 

GCACCTTTTT 

TCAGCACCCA 

ACACCTTGTG 

GGATTGGGCG 

TGGTGACCGA 

TTGTCTAAAT 

CTACAGGCTA 

CGCAGCTCTT 

CCTTCTTCAC 

AAA CGCCGGG 

GGCCATCACA 

CTTATGCCCT 

GG CTG CATC C 

CGCGTCCAGA 

ATGG CCTAAT 

GTCAACAACC 

ACTGAAAATC 

GGGGGCATCG 

ATA CTGGCAC 

ATCAA CTTCA 

AGTTCCGCTG 



TAAG CTCAAG 
AGAAAGAAAT 
GGCTTTTAAG 
AACGTGAACA 
AGAGCGCGGT 
TCGGACGTCG 

GATCTACTAT 
ACCTGACCAT 
TTGGTGCGAT 
AAAGGACGCT 
AGACAGTAGT 
AGGGGATTAT 
ATATTGATTC 
AATTAAGTTT 
TTATTTTTCT 
ATTATTATTG 
TCAACTTTTG 
TACTGGATTG 
TGGATTTTTC 
AGAGACGGAT 
TTTTAACCCA 
GGTCAGTGTG 
CGTCAGTTGC 
TATTG CCTG A 
T ATT CG CTG T 
GTCCCCCTGC 
GATTGTTTTA 
GATATCAGTG 
CTG G TACAAA 
TTGTTGAATG 
CT \j jS\ TGG A C 
*GAAA 
CAG 

ACGGCGCGCG 
CGGG 

nuu v*o 1 unL 

CTTTAGGAAG 
TTCCGCGTAC 
CCTGATCGCC 
TATGCAGCAC 
GTGCGTCAGC 
TTTCATATTC 
CCATCGTGCC 
GACATGAACG 
ACAGTAAAAA 
GAGGAGGGTT 
TGTATTCTGC 
AAGGTG CG AT 
TCGTGATTTC 
TCACTAACAA 
CTGTGGCGCC 
AAGAAGTCGA 
GAGAAATACC 
GCCTG TTTTG 
TGATTGTAAT 
CCGACATTGC 
GTAATAGGGG 
TCATTAATCC 
AGGCGGTCGT 
TCGCTGGTCA 
CAGTCAGTAA 
TATTCGTTGT 
CCGTCAGGTT 



GCTCCATCGT 
GACTGTGAAG 
TCCTTCCAAC 
GTGAAGCGTT 
GAAGCGACGT 
TTCATTTCGG 
GATTTGGCGA 
TTTGAAATCT 
GTGGTTGTTT 
AGTCAGAAAG 
GCTGATATTG 
AAACAATTTA 
AAGCTTGAAA 
ATTATTATTT 
TAAAAATTAA 
TATAATAATT 
TGAGACGGTA 
TCCGGTTTCC 
ATTAGCCGGA 
AGCGAGATGA 
TGTGGCCAGG 
AC CAGAAA CC 
TGATTTACCA 
ACAGTGATGT 
TTGAACACCA 
CAGCCATGGC 
TGTTTTATCA 
CACTCTCTGA 
CGCTCAGTGA 
AACATATCCG 
CCGGTTATAA 
ATACG CTGG A 
CCATGTTGAT 
AAGGCAGAAC 
CATTATTACG 
AGAAAATTGA 
CCCTGTGAGG 
CTGAATGAGA 
AGGTAAGTAA 
TGACCAAGAG 
AGTGCGGGGC 
GTCGGTTTCC 
CG CATCGTAT 
CAGTGGCTCT 
TCCATAGTTA 
GCGGATCGCC 
ATCGCCGTCA 
CCAACCGGTA 
TTCTTTGAAG 
TTTACGGGCC 
AACATAGTTT 
GGGGTATTCC 
CGATGCTACT 
CTGACA TACT 
CTCGCGCTTT 
ATTGCAAGAA 
GTGTGGTGCC 
CAATCTGGCC 
GTTCGGATAA 
AGG CGGTAGG 
GCGCAGTGTT 
TATAGGCAGA 
ACAGGGACTT 
TATATCCCAC 



GAACGGAGAT 
AGCGGAGCTA 
CCGTATAATG 
CTGGAAACGT 
GCATAGTCCT 
GGTATTGATG 
TTAATCAGAT 
TATTTAATCA 
ATCCGGGAAA 
ATATTGACTC 
TGAACTACAT 
GCAATAAGCA 
TTAAAACAAA 
ATGGATAAGA 
TTCTACTTTT 
GATAGTTTAT 
ATAATTAACA 
TGACCATGAA 
CGAGTGTTGG 
TAGCTTTGGC 
AAAGCAAAAA 
G CCCGGG AAT 
CACTAAAACT 
GCTGTATTCT 
GTCCACGCCT 
TGCGCATCTG 
TGGTGAGGTG 
ACACGCGGCT 
TGAAGAGATC 
TTGCCGGGAT 
TAGCGCCGAA 
TCTCGCCCAG 
GACTATTGCA 
AGAAGGCAGA 
GCATGGTGTC 
AG CGTTAAAG 
CCACCGG AAA 
CGTCCTTTGT 
CCCAAACCTT 
AAGATCCCGC 
GTATCCAGTG 
GTGTCTGTCA 
TGGTTATTCA 
CCTGTTCTGT 
GCAAATCCGA 
TCATCTGCCG 
TATTCATATC 
CCAAAGAAGT 
CTGGACTTCT 
CGGGTTCCAA 
GGGCCATCAT 
CAGTCGATAT 
CACAAATGTA 
CCAGCCGCCG 
CAGATTACGC 
ATTCTTCGGG 
TAAATCACCA 
A CTG CTGG CT 
CTTGCCTTTG 
CGGGATTTTT 
ATCCTGGGTT 
GACTTTAGGC 
GGCAACACGT 
CTTCTGATAG 
34261 GTTTCTTCTG TGAGTGCATC ATATTGCAAT ACCTCGGTTT TTTCTCCCGG CGGTACATCA
34321 GGCGTATTGG GGTTACCGTG ATCGGCAATT TCTTCCGGTG TCGCCTCACG GACATATTGC
34381 CAGGCATTCT CATAAACCGG TAAATCAGGT GAAATATTGC GGTCGGGAAT ATGCCAGCGT
34441 TCAACCCAGC CGATGTTTTT AAAAACCGCG CTATCATAAA TGACATACCA GGTTTGACCA
34501 CCAGATTGAT TCTGCCAGGC AACCAGAGAT GCGCCTACTT CGCTGCTGGC GTCAGACATC
34561 GCTTTAATTG AAGGGTATCG ATAAACATTT TGAGACATAA TTTCACTTCC GGCCCCGTTA
34621 TATTCCGGGG CCGGCTCCTG ATATCAGTTA GAATTGTCTT GTTTTAATTG ATGTTTATTC
34681 AGACGGCTAC GAACCTGCTG GCTGAACTCA TTACTTCCGC CACTCACATC ACGCGCGGTA
34741 TAACGCAGAT GGAGGATAAT ATCGCTCAGC GACTCCAGCA GCTGATCCTG ATCGGAACCG
34801 AATTCCAACT TCCACTGTGA AATGGCGCCT GTCCCTTCAA AAGGCAGGAA AAGTTCATCA
34861 TCAAAATTGA GCCTGAACAT GCCGCTGTCT TCCATGGCCG TTGAAATCAC CACACCTTGA
34921 TTAGCCTGTA CGTTCAGCAA AACGTTTTCG GGTTTGGTGT ATTCCAAGGG GTTAAGCAAA
34981 TAATCGATAG TTTTTAAGTC AGCAGTACTG TAAAGCGTAT TGCTGAGTTG TACCAGTGAA
35041 GCCCGTACAT CTTCATAAGG CCCCAGCAAT GCGGGCAATG ACAGCGCTAC GGTTTTTATA
35101 CGCCGATCAG CGTGGGTCGG ATAATCGCGC AAGAACATTT CGGCGCTCAG TAAGAAAGTG
35161 AATGAACCCG TACTCTTGCC AATTTCCCAC TGTGATGATG TCAGTAATGA TTTTACCGAT
35221 ATGGTTTTTA TGATCTCCAG ACGTCTGGTG TTATGTTGCA AATACGCCTG ATCCATCCGT
35281 TGTAAGGCTA ATTTCAGATG TTCTCCGACC AGCAGCCCCT GATAAAGATC ATTCCAGAGA
35341 CCACTTTGGA CGAAATTCAT ATCATACTGA CCTGTTTCGT ACTGCCAGGA GGCTTCGGCC
35401 AGTAAACAGA GGGAATTAAC CGCATCATAG GCTTGCAGGT AAAGCCGGAG ATTTGGCTGA
35461 TCATCCACAT GTATAACGCA TCATTGGTAN ANTTGTTCNN NNNNNNNNNN NNNNNNNNNC
35521 CCGAAGCATA CCGCCAAGAC CATCCCCCCG ACGGCCAGAC CGAAAATATT GGGAACCATA
35581 TCCGCCACAG CGGCCGCAGT GGCGGCTGAC TGGGCAGCGA TCACACCTTC AGCCGCTCTT
35641 GATTGTAATG CGATAACTTC CTGCTCGGTG ATGGAGATGT TTTCATCATA GAGCGATTTA
35701 TAGTGTTGCT GGCGCTCCTG AGCGGCCCGT CGGCTGATGG TCAGTGCATC CAATGAAGCC
35761 TGTTGCATGT CAATCGCTTG CTGTTGCAGA TTGCGGGTAA AGCTGTACAG CCCCAGTTGC
35821 TGCTGCATAC GGAAGTGTTC AAAATCGGTA TTGTCTTTTT TCTCCAGCAA ACTCAGTAAC
35881 GTGCTGCCGT ACTGAATCAG CGTTTCTGCG GCCTCTTTTG CCCGGCTCAT GATCGGGGTG
35941 AAACGATAAT TCGGGATTGC CCGGCGTTTC ATGCCCGCCA TACGATTAGC CACAACACGC
36001 TGGTAACGCT GCCTGAGCAG ATCTTGCGGG CTGATGGGTT CATCGTATAA TCCGGCCGGA
36061 AACTCTTTAC CATCCAAGGT CAGGTTATGA CGTAAGTTAT ATAGACGCTG ATCCAACATT
36121 TGCCACAGTT TGAGATATTC CGTATCAACA GGTTTGACAA ATAAATCAGA CGGTGCGGCA
36181 GAGACGGATG TATCATATGT CACAGGCAGA AGTGGCACGT TGCTGACAGT AAGCATTAAC
36241 TCCTGTGCCC GTGCTTCACT GTTTTCATAC AGAGCCACAT CTTGCAGCGT ACGGGGTTGC
36301 CAGTTTGCCG CGAGCAGAAT ATCAGGGCTG GTACCCAGTA ACATATTGAC GGAGTCATAG
36361 ATCTGCTTGG CGACAGTACG TGCACTGGAT GTCAGCTTAC GGTATTCCAT GTCTCCCTGA
36421 TCTAACAGAT TCTTGACATA GAAACGGAAT ATTGCTTTCC GGTAGTGAAT GGGTTCACTG
36481 GCTGCAATGG CATCCGGATC GGTTGGTTCA ATTAACATCC GGTACACGGT GGGTGGAGGA
36541 TCAATAATTG GCCGTGAATT CCAGTAACGC GGTTTACCTT GGTTGCTGGC CTGAACAAGT
36601 TCATCTTCCA GCGGATTAAA AATATAGTGC AGCCATTCGG TGGCCTCTTT TAATCGTTGT
36661 TCTATATTCA GTCGCCACGC GACCAGAAAT GGCATATGGA AAAACAGTTC CCAGAAATAG
36721 ATCCCATTTG CGCCATTTAA ATCAATCGGC GTAGGGAATG AACCGGGTAT AGGCTGTTCG
36781 GTAATAAGCT GTGTATTCCA GCTCAGTACC TGCGGGATAC CCTGACTGGC AATGGCGATC
36841 AGTTTTTTTG CAAACAGTGT ATTAAGGCGA ATGTTTTGTG GCGCGTTATC AGTTTCATCT
36901 GCGGGGAAGG AAAGGAATTG CACCTGATCC TGTTCATTGA GTTTAATCAG TTCGCGAATA
36961 TGCATACCGA TTCTGAACTC TTGAGTACAG CTGGCACTTT CATTGCCAAC ACCACCTTTG
37021 GGCTTAAAGA GAAGTTCGGC TTTCAGGGTG ATTCGATTAT CCGACCCCAG CTTGATTGAT
37081 GGATAGGTTA AATCAAGAAC TTTTTCGCTC AGTACCAGTG GTTGTTCATC CAAGACAGTA
37141 TTATCGTGCA TCAGCCGGAA AGAACCGTTG TAATATTGAT GATCTTCTAT CGCACCAAAC
37201 TTAAAGTCAG ATTGAGCGAC AATCTCCAGT GTGTCATCAG TGCCATGAAC AAAATTGACA
37261 ATCAGTTTGA TACTGTCTTT GCCGAAATCA GGGTTCATTC CGGTTTGGAT TCTCCGGCAA
37321 TAGGAAAGCG TTCTTCCCGG GTTGCCGGAT AGAGCACCAT AGTACGGTAA TCGATAGGAT
37381 TGCCTTAAGG CATCCTTGTG TTCACGTGAG TAATACCAGA CCAGGTTGCC GACATATTTT
37441 CCTTTTCGTC CATCAGCATA TTGGTCATCC GGCAAATCAG TAATTTCTAC CAGCAGTGTA
37501 TCGCAGACAT AACCGAAGGC TTCGTCATAA TCATAATCCT TACCTTTCTT ATCTGTCCCC
37561 TGAAGACGGA CAAACGGAAC CAGAGCCAGA AACGGGTTAT GCGGGTCTTG CTGTATATCC
37621 ATCACAGCAA CCATCTGGGC CATCCGGTAT TGCAGATGTC TTCGCGCAGA ATGGTGGGTG
37681 TACTCCAGCT GCCATCATAT TTGGCATAAG CGATTTTGAT CCGGTCAGGA ACGGTGTGGG
37741 AGGAACCCAA TCACCCGCAC TAGGCTCAAC GTTTTGGTTA TGCAGTGATA ACGCAGTTGT
37801 ATCTTTAGTT TCAGACTGTT CTTCAACTTC CGTCCAGGCA ATATACAGGC GATTATTCAG
37861 GAAAATGGGG CGTATCAAAT TGGGGTCTAC GCTGCCCAAT GGCAGGTCAA TAGGTTTCCA
37921 CTCGCTCCAG GCATTGGGAG ATAACGCATC GGTATCAGGA TGGCGTATCG AAAGATTCAG
37981 TGAACGCCAG TAATATTGGT ATGGCTGTGT ACGGGTACGT CCGACAAAGA AGAACTTATC
38041 GCGTTTGATG TTAACACCAT CTTCATAACC TGCGATAACT TTCAGGTTAC TGACATCTTC

38101 AAAATTATTC AGATAACCGA GCACCGCTTG TTGTACAGAA TCTTCGGTAA TTTTTCCCTG
38161 ATTAAGGGCA CTTTCCAGTT GGAAGAAGAA TTCTGTTTTA TTCAGGCGTA ACAGGGGTTC
38221 CAGATAGCTT TCCGGATAAG TCCGTAATAA GCGATCCC



N=unspecified base 



Fig.3. 

2 TCCACAATTG CCGGAGAAAA TCAGTCGGSGA ACTSCCWTC ATTATTQBTC ACTTATTAAA 

61 CGAATTTGCe GACCAGAATA AGGCTAAAAA ACTGCTACAG QCGCAACGCQ ACTCGAACGA 

121 AGCGTTAACG GTAA*G*GTC ATTCGGATCC GCTGTATCGC TTTTGTGGTT ATCTGMGTC 
TGTCAATOAT ATCACCGOAA TGAAGATGGG CAATAAAAAC ATTAGCCCAC GAGCACCGAG 

341 ATrcrAcrrro tatcatgcct atctctcttt tatggaagcg cacggctttc aacgtccgtt 

301 AACACTGAttT AACtTTO^tG AATCCATCCC caagattatg C7TC3AATACC GGAAGGAGTA 

3fil TCSAAAAGTO CGAACCAAGA AAGQCTATTC CTATAACGTG OAATTATCOG AAGAGGCCGA 

4 21 AGAATGGCTA COGTCAGTGC CTCACTCtTO AGACTCTAAA TCACCTCTAr AAAACTTTGA 

4 61 GCTTTAAGTC TGCACTCCAT ACACAACTTA AAMATQTAA TTGTATTTAA AAGAAAATAA 

£41 TAjGATGTATA GTTATTTTTT AAC TATA CAT AAGCTCTAGA TSCTCTTCAT TGGTCTAAAA 

602 AATGGGIXSAA CAOGTGATAC AGTCAGTGAA TATCA TATTA ATTACCGTAA ACCCAGATGT 

&61 AGCAAOTCTT TCAG3GAATT G7GCAGAGGG TGCATAACTG ftGAGGGHGAA AAAGATTTTC 

721 AGGGGGSCTT ATGGCAGGTA AACAAAATCft SAAGCAAATA CCQTGCACAA TCTGGTTTTT 

7ai ATTTTTTG5T ACTACCTCAA ATTAAAATGA TGTAATCATC TGAtTTTATT TAASAATAGA 

B41 AGTTAATCAC AATTTCATTG ATGGACTTTC ATTCACACTG GTATAGATAA ATAATTCTGr 

90" TATATCCTGT TTCA7TACGC ATTCATCAGG AGTGCTGTTA CAGGAjGACAA GAA^STCACA 

PC- CATCATTTAC TTGTOCpTTAA AGGGCAAGAA GCAGGGTTTA ATTTCAGOGG GTTGTTCAAC 

If jl GCCTGAATCA ATTGGAAfcTC GCT^lTCAAAA AGGACGTGAA GATCAAATAC A GGTA TTGAG 

IDfc- CCTtfAATCAT TCGATGAGCC GTGACCAGAA TGTT^TCAT CAACCOGTCA CTTTTGTGAA 
/.CCCATfGAT AAATCCTCTC CCCTGTTTGC TGG^TCCCAG TTTJX3TGCAT TACAGGACAA 

12C" GCCAGATGGG ACAACTGGAG TTCTTTTATG AAATCAACCT GACCAGTCCC ACGATTGTGG 

22 1 : . ATA7TTCCTA TAATTATCCG GEATTCAATC AATGATAATG GTGCGATACC CCATGAAGTG 

13^- GTGATGCTCG ATTATAAGTC C/sTTTCATQC AACCACATCG CC&CAGGACT TCGGGCTACA 

13S 1 OCATACGCAA TTAGCCGGAA gtgaagaagc aagccgcttt tatctogggt CTCGAATGTT 

1W A&GCCaCTXA AGAAGCCGCT GSITGAAGAA ACCCCGGrAA AACCCGCTAA ACATCATGCC 

15C5-. CGTTATOS7T GT^TCO^tGA 7XSACG0CAAT C7TTTAACCG AACGCAAGTA TtG-CGTTTG C 

lSfc" CTGCClSGATG GTCAGATAAA AGAAGGAAAG ACTCiATAAAZ AAGGTTACAC CCAATGSrAT 
CTTACGGATG ACAAAAATAA ACTTGAATTT CAT/=rrrT/^ AGGATTAATA CCATCCCAGC 

lfi£l CTATACCGTT CASACAAAAA TAGAATCCAA CGT^CCTGTT GJiAAACCTGC TTTACG ACTT 

174^ AAC t c TTTAT CGTAAGGATG CAAAfcGGAAA TTTO^TiTr TTGCTTGATe ITlTJ'tftGGA 

190^ GAAACTACAG AGTAATTATO AAACACAAW GC^TArc?.rs CAGGAAATAG AOTACGATCT 

1661 TTCTGTGATT TATAttATGC AAATTATGCT TCAT"GCAAA CATTGGCTCAA A^ATATTTCC 

1921 GGCA^TGCAA ACCCATTXTA AGAAAATGT/. TACCTTTCCT GAATTAACTT CCGGTAAAGC 

IPB'jl CTGTTCGGAG AAAAAACGGG AAAATGCC7C TTATTTTCA?* AGTACAGTTG AAACAAAACC 

2D 41 TGTCMCGAC GCrGGATAATA COCTTGACTT AAATATCACT ATTCCTGAAC OACCTTTTAT 

2101 TGCCAAAGAA TATCCCATTG GTCACCCAC^. CGA7CC-.TTT OAAAAAAGTA AAATTGA.ATC 

^l&l ATAAATACAG GACAGGTTAf CG-AAAAGA/i T TTATCCGGAT CAAAATGGAG CAAG7TTATC 

222: TCAGO^CGCG AGCACACTAT TTTAGCTCto TTTTTAAGAT GATTA^CTCT TAATGTTCW 

2231 TTTTAATAGT G TTTTTA TO5 AGTGAAATTT AftT^« IZACAG GCAATTCTTT AGACTTTTAT 

2341 AGAAAACTAA AGAATTAAAO AACAAGATTC ACATTTVA^ TTCA-AATATT AATCAAAGTA 

2401 TGCTTCGOCCC CTGAGTTTAT GTGGCCCTGC CnCTTTTTTT TATTGOCTGC CAATAGATAG 

2461 ACCAGATATT TATGAGCAAG OG5CACGAGA ATTATG^^AA TATOGCOGAA CTAAAATTGG 

2521 TCAACTGGAA ATTAAGCCGG GTCAGCGTTG CCGACArrtIT AAA5GTACTT TTTATAATCA 

5591 ATATGGTQAA AQAArATGTG GGTTAGATTS GCTGACATTG GGAAGCCTAA GAGATTCAfiA 

2641 AAATATGATG ATGAQGTTGA TGATGAAGTA GCTGGTATTA CAATGTO»3 AAAATTOAC&

2701 GAATGSTTTG AAAAATCAGG GTAT5AAAAA GTATTTAGTA ATGTCGSCTT ATCCCATTCT 

2761 AATATAAATG ACATAQTAAC TCTTAg TGAT TACTiTAACA AAGG A T A TCh TGTTGTTACr 

2821 TTGATITCAG CAGGAATCTT AXCAGATTTT GGTGACATAG AAACATCAGG AAAAAATCAT

2881 TG-3ATAGTTT gggaaggagt agtagaaaac TATGAGAAAG AAAATATCAC AAATAATTCA

2941 GATMXSAATC AATATCTAAA TTTAAATCltS TTTTCATGGG GTAAAGTCGA ACATCAAAT7

3001 AAAAAAAAGA AATCACTAGA TTATGTACTC AA CCATA TTT TTTGAGGGTT GGTTTTTAAA
3061 CCAATC3AAAT AACATGAAAA AAATATTAAT TAT7TTTATT TTTTTACTTT ATGGTTCTGG
3121 TAATCCAACG CCAAAASTTT TACCAAAA^C AGAGTTTCTT CCTCATGCAG TGATAAftTGA 
3181 ACCATATCAG G CATC AATTA CCATCACAGS AGGTGCATT^ AATGAAAAAA OCGTTTO5GT
3241 AAAAA7TCAT CCTACTCGCT" CAQGACTAAC ATGGAATCCA AAAGATAGTT CTTTCCTATA 
3301 GGGT33AAAA AAAGAAATAA GAAAAGATTA TCATCATATA AATATAACAG GTACCCCAAA
3361 GAAGXCAGAA TTSATAAAAA TTGAAGTGGT AGGATTTACA TTGGGTACAA TGTACGCAClS 
3421 GAAAGAGTTC ACTATAAATT ATACTATAAA AGTAAGGGAA TAATTCTCAC TATCAGAATG 
3481 GTGATTTAAT TCGCCATTTT TATACTTTTG TATACTCT1T CAACATAATC AG G ATT CTTT
3541 CrTATTATTT TTCATOGTOC TAAAAAC^TT rATKpCAAAA ATAAATTAAG TTAATCAQAT

3601 AAA TT AT CTG CATTACTOTT AT AA.T OGATA ACAOGWAAC CTOACTTTCT GCd^TTCTT 

3661 ATGAACTCDA AGATAATCCT TTCTGAjGCCT "3AACGAATCA CATTGCAACC ACTCG CT TUXp
3721 AATCACCCAC ACOSGGACAT TOGTACGCGA GGAAOGGGTT TACTCATGCT TQCCAGA5GG 
3781 AGCAAGCCGT CCCAGATCAC COCTOAAATC OGATOCACTC TCOWOTTAT CTGTAATMG
3841 GTTCACATGT GGCACAGATA GCQQGJLTTAT TDOGOSGTCA TGCCGOAjSGC CGGTATCTCG
3901 CCATCACGCC TGACATCATT GCCACTGCGC TCGAAGCCGC CAGOGCRGAS TCCCTGAOGT 
3961 GCGTCGAAGC CAGGCAGGGT TTCCCTGCCT TGTACGCTTG AAAOC5CTGGC GAATACCCTG 

4021 AAAAAACA^G GOtrrCCCCTA TAAAOGCCCC CGCCTGTCGC TTAAAAAAAG CGCAATAAAA
4081 CGGAGTTTGC TGAAAA ATCC GCCTTCCTGA ATAAAATTAA GGCCG3AGCA CAGTCASGAt
4141 ATTACOTTCT GGTCTATTTT GA-3TTCT03G GGCSTTAAAT TACACGWITA ACACGCTGTT 
4201 TTACCAGACA ACGTCAGGCA GTATCACC5CG AGATGAOGTC ATTSATTTTT TAGAG CCGGT
4261 GGSCAGACAA CGGACAACCO CCTGACATTT TTAGTGTTSG ATAATOQGCG EATCCATCAC 
4321 GGGATAGAGG AAAAAATGA-S AAATGGCGGG TCAOGAGAAC AGAACCTGTT TTTATTCTAT
4381 CTTCCCGCTT ACAGOCCAGA GCTGTATCTG ATTGAAATCO TCTGQAAACA OTCCAAATAC
4441 GACTGGCGAC GTTTTATCAC CTCGACTCAG GATACAAT3G AATATGAGGT AAATACTTTA
4501 TTCAAAQGTT ATOCCGACCA ATTTCCAATT AACTTTTCTT GAGTaCTTAG TAAGAATAGA 
4561 GTCAGTCGAG GTTTTTTC?*T TTCGGGTCGT GGGGATSATA CTGAAAATTT GTTTGTaAtC 
4621 TCTCAAAATT GcT»STTTCTG TGGC^ACGTC TOTCTTTTG^ GATATTOTTT CCATCAAGTC
4681 TGTCAACATA CTGTTAAGTT AGATT3TTGAT AAAA-3 AG A CT GAATTATAAT ACAAAACAAT
4741 AAATCACTTG GACAATATlT TATTTCACAT GAGACATTAA GGTTGATTfi' CCCAATCTGG 
4801 TCAGWATAA CCGAATAAGG ATCTTCAAAA ATCATGGGAT CTTaCTTTTA TCAAATCAAG
4861 TTAACG TAAA AGTTGATAAA GAAAATTATT TAATTCTAAG TGCCGTMQC A rAAATATTT
4921 TGTGT7TTGT TAATSA ATG ft ATAACCaGGT AAGCTGGATT TTCATTTTTT AA TTACTOG T
4981 T ACAATATG £ TATTTATTTA TATAAAGAGT TTGTGCCCAT TTAACCAGTA AACAAJX TTTC
5041 TTCAAOCSTA ACTTAGCTTC ATCGACTTTT GSCCT CGCC T GGTCAGAATC TAGGGCCGTT 
5101 A7CCTATTTA TTTATGATAA ATAAAATTTA ATTATCTTTA ATAAGCTGAA TATGTGSATT
5161 TGTGCTCAAT CTTGGATTCA ACTATGTATT CCTTTTGGTA CCCTCCTTTA TTTTAAGGCA
5221 GATOAAGAGG ATCCCAACAT GACACAATaT CG ATTA CG A C TGTAACATTA AAGTCAGTTA

52 a i taa/lttttat gattaaaatg aaattttagt agaaaatcgt attctattcc gccat-ttaca 

5141 ATAGCATCCT CTTTAATATC ATTAATCTCA GATAAAACAA AEAATTA CAA TGTGAA^rAGA 

5401 ATAATGACTT ACAAAATAAG CACTAAATCT TCAQATGAA" TCTTAACT3A CAACACTATT 

5 4 til TTATAAAATA ATTGAOGTTA TTATGTATAG CACGGCTCTA TTACTCAATA AAATCAGTCC 

5521 L2ACTCGCGAG 3GTGA3AOGA TGACTCTTGC GGATCTGCAA TATTTATCCT TCAGTCAACT 

55 Bl GAGAAAAATC TTTCATSACC AOCTCAGTTG GOTAGAGGCT C GCCAT CTCT AT CATG AAA C 

5^41 TATAGAGCAG AAAAAAAATA ATD3CTT6I7T GfiAAGCGCGT ATTTTTACCC GT3CCAACCC 

5?D1 ACAAlTATCC GOTGCTATCC GACTCGGTAT TGAACGAGaC AGCGTTTCAC &CAGTTATGA 

5761 WAAATGTTT GGTQCCCaTT CTTCTTCCTT 7GTGAAACCG GG*tTCAGTGG CTTCCATGTT 

S321 TTCACCGGCT GGCTATCTCA CCGAATTGTA TCGTGAAGCG AAGGACTTAC A TJ ' IVP CAaG 

5GB1 CTCTGCTTAT CATCTTCATA ATOGCCCTCC GGATCTGGCT GATCTCACTC TGAGCCAGAG 

5941 TAATATGGAT ACAGAAATW CCACCCTGAC: ACTGTT7TAAC GAACTGTTGC TGGAGCTATT 

60 01 ACCCGCAAGA CCGGAGGTGA TTCGGACGCA T7GATGGAGA GCCTGTCAAC TTACCGTCAG 
5<J€1 G CGATT-GATA CCCFTTACCA *tCaGCCTTAC GAGACXATCC GTCAGGTCAT TATGACCGAT 
£±2 1 GACAiSTACAC TGTCAGCGCT GTCCCGTAAT CCTGAGGTGA TGG65CAGGC GCiAAWGOCT 

61 SI TOiTTACTGG CGATTCTGGC GAATATTTCT CCAGAACTGT ATAACATTTT GACCGAA^AG 
6241 ATTACGGAAA AGAAOSCT^A TGCTTTATTT GCGCAAAACT TCAGTCAAAA TATCACGCCC 
63.01 GAAAATTTCC 05TCACAATC ATGGATAGCC AAGTATTATG GTGTtGAACT 

6 261 CAAAAATAC- TCSOGATGIT GCAGAATGGC TATTCTGACA GCACCTCTGC TTATGTGOAT 

6421 AATATCTCAA CGGGTTtAGT QGTCAATAAT GAAAGTAAAC TCGAAGCTTA CAAAATAACA 

64B1 CGTGTAAAAA -CAGATGATTA TQATAAACAT GTAAA7TACT TTGATCJGAT GTATGAJU3GA 

£541 AATAATGAAT TCTTTATATCS TdCTAATTTT AAGATATCGA GAGAATJTOG GGCGACTCTT' 

€fi 01 AGGAAAAACT CAGGGACAAG TGGCATTGTC GGCADCCTTT CCWTOCCCT GGTAJ&CCAAT 

6661 ACTAATTTCA AAA5CAATTA CTTAAGTAAC ATATCTGATA ATGAATACAG AAATGGCGTA 

67 21 AAAA TATAT G CCTATCGCTh XACGTCTTCC ACCAGCGCCA CAAATCAGGG C3GCGGAATA 

6?ftl TTCACTTTTG AGTCTTATCC CCTGACTAfA TTTOCGCTClA AACTGAATAA AjGCCATTCGC 

SB 41 TTGTGCCTGA CTAGCGGGCT TTCACCGAAT GAACTGCAAA CTATCGTACG CAGTOACAAT 

6901 GCACAAOSCA TCATCAAOGA CTCCQTTCTG ACCAAAGTTT T-CTTATACTCT GTTCTACAGT 

6961 CACCGTTATG CACTGACCTT TCATCATGCA CACGTACTGA ACGGATCGGT CATTAATCAA 

7021 TATOCCD3AC GArGfACAQTG TCAGTCATTT TAACCGTVTC TTTAATACCC CGCCGCTGAA 
AGGGAAAATC TTTGAAGCCG AC5GCAACAC OOTCAGCATT GATCCGGATC AAGAACAATC 

7241 TACCTTTOCC CGTJCAGCCC TCATGCGT&3 tTCTGGGGATC AACAGTGGTC AACTG^TATCA 

7201 GTTADGCAAA CTGGCGGGTC TATTGGACAC ACAAAATATC CTCACACTTT CTGTCCCTGT 

72 61 TATAT^TTCA CTCTATCGCC TCACGTTACT GGCCCGTGCC LATCAGCTGA CGSTTAATGA 

7321 ACTC^TGTATG CTTTATGGTT TTTC^CCGTT CAATGGElfcAA ACAACGGCTT CTTTCTCTTC 
Fig.2.

CSGGGAGTTC 
OGGAAATCAC 
CACOOGAAAT 
AAAGTAGTCA 

GCGGCCTGLAA 
AAACGACCCA 
CACTCCGTCT 
G^GCGAGAAG 
CCACCAGTO3 
AGCAGACACT 
ACGSA&SOCA 
CGTGTTGCAG 
CGCTGG1SAA 
GGGATAAGTC 

CGAATATCCA 
TGATTGATAA 

TGTCAACCCG 
GGGTCTOTCC 
Akr.CCCGGAT' 
CGCTGGAAGA 
TGTCAGCGCT 
AACGPGGGAG 
GGGTGAACTG 
ATACAAG5AT 
AAAAAGAGGA 

CAACGCAGGT 
CATCAGGCTT 
ACCCGGATTT 
CCTTCAAAAA 
TATCATTCAT 
TTTTGAAGTG 
GATGGAAAAC 
TACGCTACAT 
CAAACAAATC 
TTT ATCATQ 5 
ACCCTTTTTA 
AGATTACACT 
CGASTTTCCA 
AACCftGTCAT 
GATATTTACT 
TGATGTCAAA 
TATTGCTTCC 
AATCGATOCT 
GACCAAAGCC 
GGTAAATTAT 
TATGCAQCTC 
CAGAGCAAAC 
TCCSTTGGGA 
TGGCGATGAG 
Q CI7TTATTAC 
TTAICCOSAA 
TAC«ATGAC 
7G7A7TAATT 
TAT^AAGAAA 
Q&Z/: "1 "i ' CAAT 
TCC7T0CAGC 
GT^TATAATC 
CGG^CGCTGG 
CGTCGCACAA 
ACTtAtTCTG 



TCAOGGCTGG 
CACTGAAGCG 
CAjGTAATCTG 
CCGGGAGCTT 
ACCAOAtATC 
TATCGOOSGA 
ACTCOTTCAA 
CAGTGAAGCA 
CCAACCGCCC 
ATTAAMG3C 
CAOTSGCGAC 
TGG5TTCCCG 

tgcatacatg 
tatccgttac 
gcagacgctg 
ggattatacc 

GCCAGAAGGG 
TCAGGTCTCT 
CTA CATCAAC 
CCAGTTTTTT 
GCTGGTTTAT 
GATGGATOAA 
GGCCTTTAAA 
ATCACOGACA 
AACCTGCOGG 
C5CO3CO3AT0 
GCAATACGTC 
AjGTGGCGAAA 
GTTTCTGCGT 
GGAGGCGGTC 
TCAGGGCGAG 
TGGCGACAAC 
GaTGGAGAAC 
ACTCAAGGCA 

GGGAATATTC 
AACGCOGCTT 
AGCSCCAt^A 
CAAATACCGT 

ttaaaacg&a 

AGGOGTTTGA 
TTTTCTCCAA 
TTTAAAAAGr 
TCCTATCAAT 
ATTAOGGTGG 
TTGOOTGCAA 
TCATOSTTGG 
AAAGACOQGC 
AATOOTGAAG 
QGGGTGTATC 
AOTGGCATTG 
GAAGGCTTCT 
CGffTGGTTTA 
AGOSGAATGtT 
GSGTATTACA 
AACACTTGGG 
AAOSXTCSCIXJ 
TACAAAGGAT 
AGTOCCAGCG 
GTTTGCTACA 
CCGCCGGCTA 
AAGA-5ACACT 
tatgacccsa 
CGCGGCXtATA 
TTATCTO3TT
A XVTGGTT AT 
CTTAATACTC: 
CAG^CTGAAA 
GCQOSGTAXJl 
TTTAK5AT5C 
TTCTCCCATS 
GAGCTTTCTG 
GACAACACAA 
TGOGAAATCC 
A&ACTGGGCC 
CCOTCGTGAA 

tggcatcagc 
gtcactgcat 
qca1jaaaata 
gcagagcgcc 
gtgtccctgc 
tctgccataa 
cgggogctga 

ACCGACtGGA 
TATCCGGAAA 
CTCCTOGAAG 
ACTTACCTGA 
ACGTCAACAG 
AATATTACTG 

CSGTCATATT 
AATGGTACTG 
CATGATCGCA 
ACTGACAAAA 
G AT^ACTCTG C 
AATAAAAATG 
AOAGCACTCA 
ACGJlCTTGGT 
TGAATA7GGG 
CGCAGATAAC 
TCACTGTCAG 
AACTGACGGG 
TAAACATTAT 
AAAACTATAT 
TTCTAACACC 
ACACAATTTT 
GCAGTTATGC 
CATOtXWCTG 
TAGCT3-3CAS 
AjCAGWTTGA 
CCTTTACCAA 
GAQTGCTOTQ 
ATATTCTGTT 
GtATTCGTCT 
ATACTATCCT 
TTCCCAACTT 
AAATCCATAT 
fATCCGATAC 
TGCATGAAGG 
AATCTOCTTT 
ATCAT^TTC 
TTTTGAATOT 
CCCTCTATTA 
GGAAAAACAA 
TATOGTTAAC 
CCTGGAATtjC 
CACAMATAA 
TGGCCTATCG 



GTATCAGGTG 
rATGTATOCC 
TCOMCCCCG 
TTCTCGCGCC 
TCCTGTTGTC 
TGGTGCTGAA 
TAATO3GACA 
^SCTGGTTCAT 
TATTGATACT 
CGGCTCTOAC 
TC03TGATGG 
CCAACTTCAG 
ACTOCTCACT 
TAAACAAAOC 
TGGCAGOOBG 

tgagtaaOgt 
acagccggga 

AAACGACCOG 
ACCGGATAGA 
O^GTOAATAA 
ATTACATTEA 
ATATCAGCCA 
CCOCTTTGJLA 
CAACACCHGA 
GCGTAACGTG 
TTGGJICGAAG 
CAGGGAAOGT 
ATCCGGTCGA 
GTTGGAGTGC 
AACCTGACAC 
TOGTGT7TGT 

CCOTTACAGC 
AAGAAAGGCC 
TTCTGCCATC 
CAGTAAATAC 
ATATGATGGC 
GTTG-SATCAA 
GGCG-GTTACT 
TCCATCAGTT 
AGTTGAA-^AT 
aaaCaccott 
tgttgatggt 
gctggatatt 
TAAAACCCAC 
TGCTATGCCG 
TAArATTGCT 
TAAGATCAAG 

t^ct^cOtcaa 

TAATACCCTG 
GACAATGGAA 
TGTTCTGCCT 

CGGGAArcrr 

<3TCOGAW.Cr 
TGTCAGATTG 
CTTTTATTTT 
AGGAAT0ACG 
TTCTATCGCA 
CTGGGAATGT 
TTCGADGAAG 
GQAGAAATCG 
CAATCDSTTG 
AGTTGCCACC 
CGAAC7TOACC 



AGSCAJCTTWC: 
AGAGTTCAGC 
TOTTA GTGftA 
GTTTA.TTGCT 
GACTOATAAC 
ASAGACGCTG 
GTTATCGCTT 
TTOOSA TTTT 
CTSTTCTCAC 
A03CTOGATA 
GG CTGQACAT 
TGTTGGCAGG 
GATGCCtJTCG 
CGAGTCGAAT 
ACteAGTACA 
CTTCTGCAAT 
TGACCTGTAC 
ACTGGCAGAG 
GCC7AATGCC 
COTTTACAGC 
CCCGACCCAG 
GAGTCiM?crc 
ACCGTGGCAG 
C7"GACCT«3T 
CATATATCAC 
ATTGATACAG 
THGCACCTTA 
AACCTATOAC 
CCCCTGGTCT 
TCAAQ5GCTO 
GTACAAAACC 
GACCATTTAC 
CAACTGAAAA 
AGCTATOGTTr 
©2TGATGATA 
TCCAGCGATA 
AjGTCGCAATG 
AGTCCCAGTA 
CTGA-TCTGGG 
CAAGdCCACT 
AATTATTATG 
TTCACGOTTC 
AATAATTCTC 
GACACAGGTA 
ACCTTTACGG 
TTkCACCTTTA 
CCTCTCGATA 
CAAACATTAT 
ACTCATTOGG 
CTGGCTTCTC 
ACCCAiGCGGT 
AAATATfiACC 
GGCGGTAACA 
AGTATCACAC 
GGGGTTGGAT 
-GATGAGACAA 
CAACAGG3GA 
AO(5GGCTATT 
TCTATTACAC 
CCACACAATG 
CCCCCTGGAT 
GATGCCA-m? 
TTTATGCGCC 
CGCGATGCGr 



TOACT5A£»S 
GGGAATATTT 
GACATGOCAC 
CGAACGCTGC 
CTGCGGCCGG 
AQTGAT&AC5G 
TCCGTGCAjGA 
STGGTAGTGC 
TCTACOSA TT 
TGCTGCGCCA 
CAOTATTWrA 
ATATGAACCC 
GTTATCCX3rA 
CT5CCTGCCT 
CAACAGOCTC 
TGGTTTCTiSG 
AG CTATTTCC 
GCCATTGCCG 
CGMCCGATG 
ACCTOGOGCt 
CGTATCGGGC 
AGCCGO^tLCA 
ACCTXJAAAGT 
TTGTOSGCCA 
GGATGCAGGC 
CGtfTCAACCC 
TOStGGGTAG 
CGTTrTACTC 
TACGATA^CLft 
GCGCTGGCCG 
C^GGTGAGTT 
G6CGATGGCT 
ATA'IXTTTTGA 
TCGCGCAGGA 
CtTCTGACGGT 
ACCTTGCTAT 
TCATCAGAAA 
CGGCAATGCA 
GGCCCCGATC 
TGATt&AACGC 
CCAGATTGTT 
GTAGCAATAA 
AGGGCTTCCA 
TTAACAATAC 
CCAGTGACCA 
AGCCACTGGA 
TCGTTTTTGA 
CGOTGAAACG 
G7X3CCCAATA 
AACTGGTATC 
TACMGAACC 
CTGCTGAACA 
COGGAAGGCA 
TGTTTGTCCC 
ACCAGAAAAT 
AACAGCAATT 
TCGTOAAAAA 
CCGCCC03AT 
CCCGATGATTG 
QATAAACTAC 
CTGGAACTGC 
ATCCGGATCC 
TGTTGGATCA 
TGAATGAAGC 

CAAGATGTGG 
CCAACAGTGG 
TCAACAAGAC 
CTCGTTGGTQ 
TGOSTTTCCG 
GGCGAATTAC 
AGGGCOSTAG 
AGCGGGCCCG 
CAGAGCATGA 
CGACACAGAG 
TATTGGCAGA 
AOGAGGATAT 
AGTCTCTGGC 
TGTTOGGTTT 
TGATGTCOtT 
CCTACCGCCG 
AACAAATCGA 
TGGAATATCA 
AATTCACAAA 
AGTTCTTTGA 
TGaCCGACAA 
TGATGGCGGG 
GTGATGAGCG 
TATCATCAGA 
GCAACGTAGG 

AGTTGAAACA 
CGGTGCTCAA 
CCtACGGOGT 
CGTTTGAAGG 
C7GA7CGACA 
CCATTCGTTC 
GTTTTTATGC 

ggatcactaa 
tttcactgcc 

ACAGCAjGTAC 
TCAGCCTGCG 
OSGATG03GA 
CAACCrCACT 
GCGTGGCAGA 
AGAMTCTTT 
ATCATGCACG 
AAACCGTCAC 
ACTGTGATGA 
AGTCCACTAT 
TGTTGATAAT 
GAACTCCGTA 
AAAATGGCGT 
CCGTCGCTTG 
GGTTGCAGAA 
CAAGGTTTCC 
GATGATSTCC 
GCASTCCAfG 
ATATGGAGAA 
GTGCTCCGGT 
AACCACTGCC 
ACOOCOGTCT 
OGGAAGGTGA 
CGCAGSCAAA 
CAAAtAGTGT 
rTCATTTOTC 
TCAGTOATAT 
GCTACTGGCC 
AAATTACGGG 



TMttTGCGTG 
GCCGCACOGT 
CTTACGGOGC 
GTTTOGTCCT 
CCTGGTTAAC 
GCGAGCCTAC 
TfcCAGTGCTC 
CAATCTGGTA 
TGAT5CCCAT 
CATCCGTATT 
GAGCOGOOGC 
CAAOCACGGA 
CG-3GCASGCG 
CGCTTGTGGC 
TTCTGCCACA 
CCGCOGTCAG 
TGCCCAGCTG 
GGAGACCCAG 
CAAAGOGCrr 
CCTGACCCA& 
OGGTGTTACC 
TGAAACOTTG 
GGCACTGOAA 

caactttaat 

TGATTTGAAA 
AGTGAGTGTC 
T7ACGGCGGC 
GAATGACAGT 
TATTTwCGTXJ 
C5AAA5COCTO 
TTAATTAAAA 
AGGGTTCAAC 
AAGGAATG3G 
CTTGCCQATC 
TGCTO3CAAT 
TACCGCCAAG 
AGTGTTGAGT 

qrrttWGGACG 
AAAAATCGTT 
TTGGGTACTT 
TATTGCTGAC 
SCATACCGGG 
GCATGAACTT 
GGCAATACTC 
GACTGGTTGT 
CCOGAATTCA 
TGTCQTCCGG 
TGTCGCCAAG 
GAAAGACCOG 
TTGCTOCAAA 
CCGCTGGAAA 
CCGCAGTTAG 
OGAATTTCCG 
ACGGGATATC 
ACATATTCCS 
GOATTQGGTo 
ATCGACACCC 
ACTGGCTGAT 
ACGTCTCTGG 
AAATAAGCCA 
GACAGGCTOC 
GAACCIGGGG 
GAAACGTTTA 
UT'l 1GSAATT
CrCTTTCCGT 
TAGACAACGG 
GCdGGAATAT 
CTCCGCCATA 
GATCCGAAAG 
CCCGGCACAT 
GCGCAATTAA 
GAACTCACCA 
CASCAACGAA 
AGTGCACAAA 
GAACAGCGTC 
CTtTTCA^TAjC- 
GGCAGTCGT? 
GCTTCCCAAT 
GAGTGGGAAcl 
GAAAGCCTGA 
C&GGCCCATA 
TACAGTTO3A 
TCCTTCTCCC 
TTTATCCGGG 
CTCCTGAATC 
GTGACCCGTA 
CTCACCGAAA 
AAlXSAA^TAA 
ATTTTCAGCG 
iLCTTTCCCGG 
AGCATCGTCA 
GGTCAATTTA 
AATGACAGCG 
CTGGAGAGCC 
CATTGTGATA 
ACCTTTGAAA 
AGAAGCACTC 
TCT5TCCGGC 
GGGTCATTCG 
O&CGTTCCGC 
ATTGTGCCGG 
GTTCTGACAC 
DGTTTAGAAC 
TTTACTGCGG 
CCGCAGOATG 
GAACATATTT 
GCTCAGCATT 
A.GCCGGAAAC 
TTCATCTGGT 
ATOtXStCAG* 
ACAOTTTCTC 
TTCTGATGTT 
C-SCTOSTTTT 
CGGCCC0CAG 
TGGATTATCA 
AAAAAATGAA 
GOGTl'ACTTT 
ACTGCCGAAG 
GCACAACAGG 
ATTACGGCAT 
TTTATTCCAT 
ATTGATGGCG 
VZAMTM7C 
CTGCCGSTTC 
GGGCAATCAC 
CATGGAAAAT 

ACCCCCACAG 



<SCTGGGTGAT 
GGCGOOCAAC 
AG?AAG<?TTGC 
AACCCOSAAT 
ATCCTTCCAT 
CGCTGCTCAC 
TCTOGTTATA 
CCCM3TTDGG 
CGTTGCTACr 
CTGTOGATGA 
ATCQTCTGfSA 
CGATGTCACT 
CWAAGGGOT 
GOTGGGCAGC 
ATTCCGCAGA 
TTCA^OCSTGA 
AAATACGCGt? 
CTCAGGCTCA 
TGOGCGGCAA 
TGATOGCAGA 
GTGGGGCCT15 
TCGCAGAAAT 
CCGrCTCGTT 
AACTCACGCA 
A&CTCAGTAA 
ATACCCCGGA 
OjCTGGTTGG 

TG'TrSGATTT 
GTACCCTGAC 
TGAGCGATAT 
GSCAJ&tSCTCC 
CTTGAAATAC 
AATGCCGTCG 
GTOGTCTGGT 
GGATGG3GTG 
ACTATACGGG 
ACAGCCAAGG 
AGCroCCTAC 
ACTG2CAGCC 
ATGCTTTAGT 
AJUtCCAGAAT 
AjCTATCACTA 
CAG GTGTTA C 
CGdTriTl'C 
ATTTGATTAC 
AAACAATGTG 
CCGCTATGAA 
TCAT CA3CT G 
CCGTCTTATT 
ACTOGCCCAT 
ADoTGTTAAT 
CA05TT5CAG 
ATCAGGATAC 
GAACGAATCC 
AAAGCGOGAT 
CAOGGTOACG 
TATCCGCTCT 
CTGGGCTCCC 
CGGCAGGATG 
CCGGCAAAAA 
ATCTGG7GGA 
TTGGTCAGCC 
ACTGTATATG 



^AQCCGCSAGG 
CACACTUTGC 
ACTGAACCCC 
CAACCGATTA 
GAOSGGCAAC 
CAGTATGOTA 
CCGCTTCCXXJ 
CACCTCTCTG 
AGAGCAGOGT 
AGTGGATGCT 
AAAATACCAG 
GTTTGA TgC G 
GGCTGACTTA 
ACTGCGTXtCT 
CAAAATCAGC 
TAATGCTGAC 
COAAGCAGCA 
GTTAGAGCTG 
OCTGAGTGCT 
GGAAGOSCTG 
QAACGGTACG 
GGAAAAAGTC 
GGCACAGTTC 
ATICCTGCG? 
CCGCCAGATA 
AAGCTTTOGC 
TCCGTATGAA 
TTGCAGTGCT' 
CAAPGA TTC C 
GTTGAGTTTC 
CATTCTGCAT 
TGAGGGAGCC 
CGTrCATTGCC 
GWCGGAAGG 
GCCGGT5CTA 
GCAATGTGGG 
ACAAGATGAG 
GCAACCAGAG 
TGXTACCCGC 
ACAOCAGAGA 
GCACCTATTTC 
TOCCCQCTDG 
TCGGGCAGAA 
GGCCCACCGT 
GOG3TAAAAT 
GGTGAGCGCT 
TCTGA AAAGA 
TATGGGTTJXJ 
AAAGCGCTGG 
CTG<SAT^ATG 
GAAACGGACO 
CATO3CC7TGA 
CCATACCAAT 
TCAGAAAGCC 
GGTTACCTAT 
GTTGTTGGAC 
GGCfCTACCAC 
GCCAATCGAA 
TGACTTAGCG 
GGATCGCGCT 
TAAGCGTCAT 
AGTTACGSCA 
TCTCATOATA 
GTAGAOCTAA 



attacgocag 
aagcgggcta 
ggaacgctaa 

CTGOCAAACC 
CGTT ATC GOT 
CAGCCTTCTC 
GTGATCCTGG 
CTCAGTATGG 
ATG5AACTGG 
GATATT5CTG 

cagctgtatg 
gcggcaggtc 
gttccaaaw 
tccgcctcco 
cgttcggaag 
ggtgaagtca 
cagatgcagg 

TTACAG03W 
AtT CT ATT A C C 
CGCCGCGAGC 
ACTK5GGGGTT 
TGGCTGGAGC 
TATCAGGCCT 
GAAGGGAAftG 
GAAGCCTCAG 
AATAOCCGTC 
GATATCOGGG 
ATTGCTCTCT 
CGTTATCTG C 
COGGATG05A 
ATCCGCTATA 
TGTTTAAGGA 
CTCTCGGGGC 
GGAGCCTCAT 
TGACT GAAT T 
gttogtttta 
TatCtcgGgc 
caacgcaccg 

TATC^.GTC'CC 
CGTGAGGAAG 
GSTAAGCATH 
CTGATGGAGG 
GACQATCTTG 
TATCCTGGCA 
CA'GGTATCCC 
TATCTTCX? CT 
A7GTGTCTGA 
AAATTCGAAC 
CAGGGGAAAA 
ACGTGAACAA 
GTACGCCAGT 
ATCTGAACTG 
TGOtTGATTT 
TGGTGGTACC 
GAGGAGGCOA 
AT"CAATGGTG 
ACCATGTCAC 
TATT TCCATC 
CTTATCGGGC 
CAGGATCTPTA 
CTTGTCGCAr 
AATAGCGTOC 
ACAGGCTTCL" 
ATGGCTCAGG 
15061 CACCACCCGA TTTOtTTTAT GCCDGCAATA CTTACCTTGA AC«TTATGOC AATGAAA0CG

15121 GCAATCATTC TGCTOAACCT CAGCCTArTG ATClOCCGGA TSGWTACGT TTTCATOATA 

15181 CTTGTOSGTT ACAAAttlGCG GATACACAAG GAFIAMOAC TGCCAGCATT ATTTTGAjCO*

15241 TCCCCCATAT 6JUGGTOCAG CACTGGOTAT TGGATATCAC CATAlTCAAS FCTTGQCTOC

15301 TGAATGCCGT CAATAACAAT ATGGGAACAG AAACCAOGCT GTATTATCGC AGCTCTSCCC

15361 AGTTCTGGCT GGATGAGAAA TTACAGGCTT CTGAATCCM QATGACGGTC GTCAGCTACT 

15421 TACCGTTCCC GGTGCA7GTC TTGTGGCGCA OKAAGTGCT QGATQAAATT TCCGGTAACC

15481 GATtGACCAQ CCATTATCAT TACTCACATS GTCCCTGGGA TGGTCTlGGAA CGGSAQ3TTC

15541 GTGGTTTTGG SCGOTTGACG CAAAC3VSATA TTGATOCACG GGCGAGTCOG ACACAGGGGA 

1S601 CACATGCTGA ACCACCGGCA CCTTCGCOCA CGGTTAATTG GTADWCACT GCCGTACQGG 

IS 661 AAGTOGATAT TCTTCTSCCC ACGGJutTATT gGCAGGGGGA TCAACAGGCA TTTOCCCATT 

LS721 1TAOCCCAOG CTTTACCCGT TATGACGAAA AATCCGGTOG TGATATSAOS GTCAOGCCGA 

15731 GCGAACAGGA AGAATACTGG TTACATCGAG CCTTAAAAGG AtAACGTTTA CGCAGTOAGC 

l£ I TCTATGSGGA TGATGATTCT ATaCTGGCCG GTACGCCTTA TTCA&TQGAT GAATCCTCCA 

15S»t>l CCCAAGTACG TTTGTTACCG CTCATQCTAT CGGACGTGOC TOCOGTACTO GTTTCGGTGG 

159C1 CCGAATCCCG CCAATACCGA TATGAA GGGS TTGrtTACCGA TTCCACAGTG CAGdCAAAAG 

16021 ATTGTCrrtA AATATGATCC GTTAGGATTT CCCCAGGACA ATCTTGAGAT TGCCTATTCG 
AGACGTCCAC AGCCTGAGlT CTCGCCTTAT CCGGIATACCC TGCCCGAAAC ACTTTTCACC 

lfcl<31 AG C AGTTTDG ACGAACAGCA GATGTTCCTT CGTCTGACAC GCCAGCGTT7 TTCTTfcTCAC 

162 !Jl CAT1TGAATC ATGATGATAA TACGTGGATC ACACKfGCTTA TGGATACCTC ACG£AGTC?AC 
16261 GCACGTATTT ATCAAGCCGA TAAAGTGCLG GAUGOTOGAT TTTCCCTTGA ATGSTTTTCT 
1*321 CCCACAGGTC CAGGAGCATT GTTGTTGCCT GATGCCGCAfT CCGATTATCT GGGACATCAG 

163 El CGTGTAGCAT ATACCGGTCC AGAAGAGCAA CCTGCTATTC CTCCGCTGGT GGCATACATT 
LS*£1 GAAACCGCAG AGTTTGATGA AOGATOCTTG GCCGCTTTTG AGGAGGTGAT GGATTSAGCAG 
1S C .D2 GAGCTGACAA AACaGChSAA TSATGCO&Gr TG G/hATA P3G GAAAAGTGCC GTTCAGTGAA 
16S£1 AAGACAGATT TCCATGTCTG GGTGGGACAA AAGGAATTrA CAGAATATGC dGGTGCAGAC 
16621 G & ATTCT AT C &GCCATTGGT GCAACGGGA* ACCAAGCTTA CAGtfTCAAAC GACAGTGACG 

TGGGATAGCC ATTACTGTGT TATCACCGC^ ACAGAGGATG CGGCTGGOCT GCGTATCCAA 

16741 GCG»TTAOG ATTATCGATT TATGGTTGCC C AT A^CACC A CAGATATCAA TGATAAC-TAT 

16aDl CACArCGTGA cOtTTGATGC ACTGGIKACG GTAA"CAGCT TCCGTrTCTG GGGGACTCAA 

163 61 AACGGTGAAA AACAAGGATA TACCCCTGCC G^J^T^AAA CTGTCCCCrr TATTGTCCCC 

16921 ACAACX5GTGG ATGATSCTC^f O^CATTGAAA LCZG'J^/lT^C CTCTTGCAGQ GCTGATGGTT 

L65S1 TAT-SCrCCTC TGAGCTGGAT GGTTCA^SCv «:;rTTTT3TA ATGATOGG&A GCT TTAT GGA 

17 0^1 GAG^TGAAAC CGGCTGGGAr CATCftlTTGA^ v^^STTATC TCCTCTCGCT TGCTTTTCGC 

17101 CGCTGGCATC AAAATAACCC TGCCOCT^CC ATStO^C AAGTCAATTtC: ACAGAACCCA 

17161 CCCCATGTAC TGAGTGTGAT CACCGACCGC TATG^TGCCG ATCCGGmACA ACAATTACDT 

17221 CAAACGTTTA CGTTTAGTOA TGGTTTTGGG Z^AAJ.tCTTA CAAACAGCCG TAC5CCATX5A 

172E1 AAGTOGTGAA GCC7GGGTAC CTGftTGAGT^, TGG^CCCAAT GTQRCTGAAA ATCAAGGTOC 

17341 CCCTOAAACG WCGATTACA AATTTCCCOT TGGGCAATTT CCOSGACGTA CAGAATATTA 

174 CM AOGGCAAAAG GCAAAGCCCC TGCCTTACGT TTCA7wVCCGT ATTCCTGAAJi TAATTTGGGC 

174 61 AACTATGTCA ACTTGACCAA AAAATGCCCG- GCAOG^TATG TAT^COTATA CC^TTACTA 

17S21 TGATCCGTTG GGGCGTCAAT ATCAGGTTAT CACGCCAAAG GCGGGTTGCG TCGATtrCTT* 

17SB1 TTCACTCCCT GGTTTTGTOGT GAATGAAGTT GAAAATGACA CTCCCGGTGA hTGtiCAGCAT 

17641 AAAGCTCAGT GATGCCTGTT CACTCAACAG ACATCACTCC" ATTTA<3GAAT GAATCATCAA 

1770: GAATTTCGTT cacagcaata cgccatccct caccctactg gacaaccgtg gtca-sacagt 

17761 ACGCQAAATA GCCTGGTATC GGCACCCCT5A Tx\CACCTCAG GTAACOGATG AACXJCATCAC 

17^21 CGGTTATCAA TATGATGCTC AAGGATCTCT GACTCAGAGT ATTGATGOGC GATTTTATCA 

17ft 01 ACX5CCAGCAG ACAGCGAGT6 ACAAGAACGC C^TTACACCC AATCTTATTC TCTTGTCATC 

175/41 ACTCAGTAAG AAGGCATTGC GTAOGCAAAG TGTGGATGCC GGAACCCGTG TCGCCCTGCA 

1^001 TCATGTXGCC TTTTAGCTCT CAGCGCCAAT GGCGTTAGCC GAACGTTTCA 

ie061 GTATGAAAGT GATAACCTTC OQGGADSATT GCTaACGATT ACCGAGCAGG TAAAAGGAGA 

IB 121 GAACGCCTGT ATCACGGAGC GATTCATTTG GTCAGGAAAT ACGCCGGCAG AAAAAGGCAA 

1S1B1 rrAATTXGSCC GOCCAGTOCG TGGTCCATTA TG^TCCCACC GGAaT-SSlATC AAACCAACAG 

aa^^i CATATTGTTA accagcatac ccttotccat cacacagcaa ttagtgaaag atgacagoga 

IB 301 AjQCCGATTvG CACGGTATGG ATCAATTTGG CTTGGAAAAAC GCSCTOGCOC COGAAAGCTT 

183S1 CAC7-TCTGTC AGCACAACGG ATCCTACGGG CACGGTATTA ACGAGTACAC3 ATGCTGCCGG 

IB ^21 AAACAAGCAA CQTATOGCCT A7GATGTGGC CGGTCTGCTT CAAGGCAGTT OOTTOGCGCT 

1B4 61 GAAGG5GAAA CAAOAACAAG TTATCBTGAA AT^CCTGAC^ TATTCGGCTC CCAGCCAGAA 

1B541 GCTACGGGAG GAACATOGTA ACGUGATAGT GACTACATAT ACCTATGAAC CCGAGACOCA 

l^bOl ACGAGTTATT GGCATAAAAA CAGAACOTCC rTCCGGTCAT GCCGCTG03G AGAAAATTTT 

IS 661 ACAAAACCTG COTTATGAAT ATGATCCTGT CGGAAA7X37X3 CTGAAATCAA CTAATGATGC 

1S721 TCAAATTACC CGOTTTTOGC ^CAAOCAGAA AATTGTA^G GAAAATACTT ACA""TATX3A 

i«7£l CAGCCTSTAC CAGCTGGTTT CCGTCACTOG GC^TGAAAl^ GO0AATATTG ^CCGACAAAA 

13641 AAA CCAGTTA CCCATCCCCG CTCTGAT^^ T^iACAATACT TA7ACGAATT ACTCTCOCAC 
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Fig.2. 

1B3D1 TTACGACTAT CATOSTOSGG GJWTCTGACC A9AATMCAT AATTCACGAT CACCGGTAAT 

ie9£l AACTATACAA CGAACATGAC OGTTTCAGAT CACAGCAACC GGGCTGTACT OGAAGAGCTQ 

19022 GCGGAA5ATC O^ACXCAGGT GGATATGTTG TTCACC CCCG GcSGGCATCA GACCCGGCTI 

19001 GTTCCCGGTC AGGATCTTTT ClXWACACCC CGTOACSAAT TOCAACAAGT GATATTGCTTC 

19 1*1 AATAGGGAAA ATACGACGCC TGATCAGGAA TTCTACCGTT ATCUlTGCAGA CAfiTCAGCGT 

19201 GTCMTAAGA CTCATATTCA GAAGACAGGT AACAGTCAGC AAATAGAGCG AACATTWTAT 

13261 TTGCCAGAGC TSGAATCGCG CACGACATAT AGOGGCAATA CATTaAAAOA GTTTTTCCAG 

19321 C?TCATCACTG TCGSTGAAGC GGGTCA&SCA CAAGTGTOSS TOCTCCATTG GflAAACAGGC 

1MB1 AAACOGGCGG ATATCASCAA TGATCAGCTG CGCTACACTT ATGGCAACCT GATTGGCAGT 

19441 AGCGGGCTG6 AATO3GGACA GTGACGGGCA GATCATTAGT CAG^AAGAAT ATTACCrCTA 

19501 TGGGGG&ACC GCCGTSTGGG CACCCOAAAT CAgTCAGAAG CTaATTAGAC AAGCCGGCGT 

13561 TATTCTGGCA AAGAGCGGGA TGCAACAjGGG TTOTATTACT *OGGCTATOG TTATtATCAA 

19623 TCGTGGACAG GGCGATGGTT GAGTGTAGAT CCTG CCGGTC AGGCCGATGG TCTCAATTTC 

l&Gfcl TTCCGAATGT GCAGGAATAA CCCCAlWTT TTTTCTGATT CTSATOGTCG TTTCCCCGGT 

J? 74} TTGCCTGGAT A&3GAAAAAA GCGTATCGAA AGQCAGTCAA CATCACGACA 

15 601 GAACACCTCC TTGAACAAGG CCCTTCCTTT GATACCXTCT TGAAATTAAA CCOAGGATTG 

2$B£l CGAACGTTTG TTTTG&GTGT GGGOSTACAA GTCTGGSGGr GAAGCGGCCA CGATTGCaGC 

19521 AGCGTCGCCT TGGGGGATCG TCGGC5GCTGC CATTGGTGGT TTTGTCTCCG GGGCSGTGAT 

19 931 GG£G*TTTTTC GCGAACAACA TCTCAGAAAA AATTC03CAA GTTTTAAJGTT ATCTG JiCK 

2 0 Oil XAAACGTtCT GCTCCTGTTC AGGTAGGCGC TTTTGTTGTC ACATDGCTTG TGACGTCTGC 

201 Gl AC7ATTTAAC AGCTCTTCGA CAGGTACCGC CATTTCDGCA GCAA£AGOGC TCACCGTTGC 
2 0161 Jl G QiATTAATG GCTTTAGCC-S GAGAACATAA CACGGGCATO GCTATCA-GTA TTGCCACACC 
2 02 2 J CGCCGGACAA ACTACGCTGE ATAOGCTCAG GCC. QGQ T A_ZlT GTCAGCGCGC CAGAGCG&TT 

2 02 & I AGGGCACTAT CAGGCGCAAT TA77GGCGGC ATATTACTTG GCCGCCATCA GGGAAGTTCT 

GAGCTGGGTG AACGGG CAG C -5ATTGGTOCT ATCTATGCTC CTCGATGGGG AAGGATCATT 

2D401 GGl'AATCTAT GQGATGGC^C TTATCGG TTT ATCGGCAGGT TACTSCTCAG AAGAGGCATT 

204 61 AGCTCTGCCA TTTCCCACGC TGTCAGTTCC AGGAGCTGGT TTGGCCGA/iT GATAGGAGAA 

2052 1 AGTCTCGGGA GAAATATTTC TGAAGTATTA TTACCTTATA GCOGTACACC CGGTCAATGG 

GTTGGTGCAG CCATTCGCGG GACAGemCG GrCGCTCATC ATCCOTTTGG AGGGGAAGTT 

206*1 GCCAATGCCG CTJlGCCOSGT TACCTGGAGC GGCTfTAAGC OGGCTTTTAA TAACTTCTTC 

20701 TTTAAC5CCT CTGCACGTO. TAATGAATCC GAAGCATAAC AATCATGTTC ATTCOCACTT 

2D" 61 TC:CATGGAT GACAAGGT^G GTTTTTCGGa TGT^TG^ACA GAGACCCGTA CAE5GGTCTCT 

20 £21 GTCCAGTTAft TTTTTGGATG AAGAACCAA.T GGTGTAACCli ATATGCAAAA TGATATCGCT 

C^G^rTGAGt AATAAGCTTT TCTGTTT^.CC AC7CATACC3 GGAAAAOTGA GGGTTWITCT 

205-il G^CTGTATCG G CCACAGGAA GCCCTTLV^ TGGCAGGTAEZ TTAGCATCftT TGAAATCCAT 

1 a CO 1 CTGGAATTGA C CACTGTCAT TCATC CCA~ TGAGATCA CA ATCGCTTTGC AGCCACGTGG 
21D^I CATCATTG T A CTGCCGGC7iT AACTCACT/.r TCCCCGG^CA TCCTG^TAAG GCCCTAAAAG 
21121 GGCAGGTAAC GTCACACTGA TTTGTTTGAT ACCGCGtGTA TTACCTAAAC CGTCAGGATA 
211 Bl ATCGCTTAGCA ATATTCAGAT CCGATAATTT' GA-GGCTGGCT TGCAGTTGTG TCCCTTCGAC 

2 1241 OTTCAAACCG TTAAGCGTTC TGCC^r^CT GCCTTCACrr -G CATTG ACTA ft LTCAGTCAC 
213 01 TTTATCTTTT AAAATOAAAC TJi TTTt CtOT Qa£ACCAG£A TaCaCTTCAG CCAGAGAAAC 
2lJbl GGTTCT G GTG ACCTCCaGT£ CCCGTTCATC TTJTTCCAATi TAGCTTTTTT CCATCT-STGC 

21421 TAAATTCAGC atcagggttt cacccgctaa TAAaCCCGCA taa-stcccat gccaagcacc 

2/4 61 TGGTTTAATA 7lA-GtST\5C7'G CCGCATTATT GAATTCATfeC TGATAAGTTT GCTCT-5CCAT 

2 15 41 T^lACAGAGT GA'GACCGCCA AATCATAJUlA CTGATAATAA ATAGDGGACft ACGTTCCACG 

21601 GAGCCAGTTG TATAGCGCJG CATTACTGAA TTTACTTTGC AGAAAGGCTA ACTGCGCCTG 

216*l AGrrrtrrc cc tgctoagttt ccagatagtt TTTTTCTAAT actcccgctt cacgadgtac 

21721 AGC'dAGCGTC GCTJtATT-GAG TATCAArrTG TTTTATCTCA GCTTCCGCAT TATTGTOCTG 

217S1 AATTTCCCAC TCTTGC03AC GGCGACGGTA TATTTCTCAT TOGC7GATTT TCTC^G'TOGC 

il$41 AA7ACGH5TT GCTOACGC^G AAATTTCGAT AC'ZAATCGCA CTGGCATTGA AAAGCGCCCC 

21901 AAAACGGGAA CCTCCCACAO CAAAACCGTA AATATTGGGG ATOAGATCTS CCGCGGCGGC 

215-bl GGCCATATGC AOTGCTOTCC CSCTCGTGCT CAAGACCGAT GAAGAGAGGT AAAGATGC1AT 

22021 C5CTTCTTTT TCACCAGCGT TAACATCTTC GTCGTACAGC GTATTGAAAC TOTCAAAACG 

22D&1 AGACTCTGCA CCAT^ACGGC TTTCTTGA^G CGCCAATTTA TCAGCATCAA TTTCAGCCAT 

2214 1 GACCTTATCC TGCATTTTAA TACTTTGCAG GGCTAACTtA C7TGCCTTTGAG TTTSCAGTAT 

22201 TTCASCCAAG GCTTCTGCAT CCTGCCGTTC AGTAATGCTG AG CAGGGTAT TGCCAAA.TTG 

222 &1 TATCAACTGG CTTACCCCCC ACTTGGCATT TTCCAGAATC ACCGSAAAAC GGTACATCGG 

22321 CATCACTGCA TGAGGTAAAT CGOCGCCGCC TTGTGAAGCA GTOATOGCAO CACTGAGTAA 

22 3^1 CATGGACGGA TCTCOTGGCG TGGCATAGAG AGATXATGAC AGTCSCMGAC TOTCGATTCT 

224'il CA3GTTATGG CGTAAGTTAT AGAOOTOTTG CCTCAATGTC TGCCAGTAAC CTTGCAGTTT 

22501 TTTATTAATT TGAGGGAGGA ACAATGCGOT TAAOGAAATT TGCOGTACGT TTC^TGGQTA 

725 r/i AT3CA-GOSCG CTOACGCJiGT TGCAGCATTT T^TGTTGAT> ATtJATCCCGC ATTGTTTGGC 

22621 TGSCAGCTTC TTCCAGGCGT GGCTCTT5ACC ^TTOTTATC -CAAT^AAAAA TAA3GCTCAT 

2265 'j CACCCAATAA AGTGAGCGCC TGTACATACC ^C"ATTTTASC TTCGTTTAAG GTATCACGTT 
GAAGCTOgCG ATAOGCGCTA TCTCOGOGGG TAATCAACAA ATCCACCATT TTCATAAAOG
TAGCCACTTT ATACTGCATC GGATCAT&CT GGGCAACGGC G7CCGGATCG ACOOAATCCA 
GCGGATTGGC ATTCCAGGAC GlArCTTCCT CCAATCGSOG GACGTTCCAG TAATAATCCT 
GCATTTCACC CTOAACCGAA TATCCCCTO3 GGTTCAGATA TAGCGCAGCC AGCGTGTCGA 
TCC&GTAAAA TCTOCTCTTC CAATAAGCGC TGGAATACCA tCATCOTCBT TGTAATAQAA 
CAATCCCAAG AAATAGATTO CATIGGCGCC GTTT3AAATC GATGGGWCA GTCrTATTrr 
TCATGACACG ACTTGAATAC CCCTTTTATA TTrTTTTjATA TTTTTTACTA TCCCCTGTTG 
TGTCATTOCC GAATCATGAT 03GCATCATT AGTXiAATATA AATTGATTTT TCQTCTCftTC 
AAAATAAAAG AAAGCAGATT COCAGGATTT GTCATAGATA AT*TTTTTTGT ACCCAACCCC 
TAATCreACA CCTTCACGTA TOTAATATCC TTTAGCATAO QGAACAAAGA GC5TTACT3T 
GGTTTCAATA TCAGATAACA TTCCTTDGTA ATAAGSTTST CTGGCAGAAT TGCC^VTCAAT 
ATTCCCAATA TGGATCTTAA ACCAAOGTTC ATCACCATOC TOCTCTTTAT TQTA3GGGGG 
CAACTTAAAT GTCGCATAAA ACCCTTCACC TAATTOCGGC TCTGGTAAAT TTTCCGTTTC 
GATACTTAAA ACATTATCAA TACCAATATT GGCTCTOGA GCTAATTTTC TGGAAAATAA 
AGTATTTAAC CGGGTTCTGT AA0GGCCAAT CTGCATATAT TGTGTGCCTC ATGGCATTT7 
ATCCAGTGAT ATAACGTTAC TTGTATCTTT GGATTTTAG7 TTTATATGAA TTGGOGATTC 
AATAACAATA TCGTTATAAC OGCCGTCGGG TTCCTTAATA ATAAACTCGC TCACCAGAGG 
AATATCATAG CCTXGAATAT CMCHTOC TTGATTAAAA TCAtATACCA TAGGGTCMSA 
TTOGTGTGAA &STTTAGATC CCACATGGTC WtCAjGCATTT aactc^cta gaatatcaga 
GCC ATTTT TT AATAAAAAAC TAATGTTTTT ATCTTGGATC TCTTCGATCA TAGATGAAGC 
AAjGTTTTATT ATCTGTOGCT GGTTGAACAT AAA'FACACCC: ATCGATCCTC GCGAAGGAAC 
AGTGCCGCAA TATTTCCCAT GTTATTAATG ATTGAAACAT CATT A-G TAAA TGATTCACAT 
ATAGTATGCC ATACTCCTGT GTTATCT7TC CAATCTAATA CTATGTTAGT ATCAAGTTTC 
AATTCAGCAT CAWTGATTC ATAftTCATAA TTTATACCAA CTCCAATTTC TGATTTTCTA 
GGAA'lTTTTT ^CTTOGTTCT TAGA7GCATT AACACTCTAA AATATTDGGC ATTTOTAAGA 
TCGATCGAAA TAATAAAATC CAAAGTTCCA TAATGAAAAA CTTCTTUTTC TTTTCCAAGC 
ATTTC ATGAT GTCTATCATA ATCAA&TAAA ATAACCGTTT CATC7TCTAC CATCGATAAC 
AvGTATTlAA CCTCAtTCATT ATATATATTC CCTTTTCAAA AATTAVJTTC CATTGAAGGA 
TTGAAOGTTA AATTAATATC ACCATTTCCT GCTOATATAT ACGAGAGATC AAAAATATTT 
CCGGTAAAAC TGOCTAArTT ATlrrTTTGTG. GTTATAGATT CvTTATATTC GGCCfiMTAA 
TCTGTACCAA ATTGATtGTT GACTTTGTAT TCTGI^CCTSG TATCAAGTTC TGATaATGTG 

CT 1 TT AACAA tggogtctaa atcattttct gtgagaat&g ATAATGTCAT atcagggtta 

ATG jTCATCC CTTCTCTTGC AGGAAGACTA TTAAAAGAAT AATTGTCrTT TTTCTCATGG 
AAATAAAGAA TAAT SACGT C T TTTTC ATAA TCACAAGA^C AATACATACC AATGCTGGCT 
TTTTTATTGA TGAG^TTTTC TATTTTATCA CTCACATTAA AATTAAACGG TGAGCTCCAG 
CTGCCATCAT AA^AArATG TGACAGTTTT JJ1TATATAAT CAGTGATATC SATtTTCCCA 
rCTTCACTTT CATTTTTCAG CTCTJ J'J'JGT TCCAGCCACA EjTAAATACAA ACGAGACT^ 
TAAATAACAG GTCTCATATT TTCCTGCCAT A CATTGATCG GTATTTCAAT TTTTTTCCAT 
TCTCCCCAW CAt^OGCAGC AAATTCACCG TGCTGOWCT TTTGGTGATC GACATTOCGr 
CAATAA^ATA TTCTOGGTTC TGTCTOGCTA TAAdCAATTA. AATAAGTGAG CCCCTCATTO 
ACATTAATAC TGTCATGATA TCCGCTAATC ACCTGCftAGT TAGCGAWTC TTCAA-^rGCG 
GTCAGATAAT TTTTAAAGCr ATCTTCAACG GTATCGATAT TTAACTCACT TTGGGAAAGT 
TG CTGTAACA GGTTGTTCAT CATACCTGTC TGACCAATAC GAATGGTGGG <5TCGATATAG 
TTTTCCGGAT AAlAGOC«L'S TTtA.GA.TACG CcCCCCCft&S TGCTATACCG TCGATTGTAG 
G7TTCCCACT CGCAGAAGAA CT5ACGMTT TTCACTGGCT TTGATACTTT TCMTCAACA 
TTATTCAACG CC0GSTK3AC ATATAACTGA ATGCTOGCAA TGGCTTCTGC CACACGOGTG 
GTTTTGACTT GGGCAGAAAC TTKTTATCA ATCAGCAGA7 ACCTQTACAA CtCATCCCGG 
CTCTTAATCT GTTGAGGTGC ACCATTTTTG ATSTAGTAAG CACTGG-CDGC TGTCGTQGTG 
GCTTCATC5CA OCCATGCCTO AAGCTGGTCG GATTGTTCAC ITjTTtlAGTCC CQCCTGCAAC 
AAAGTACTO5 CSGCTTGCCA ATOiTCIPlAAT GTTGGCATCG TTCACOSACa 

TArrrrAATT- ttatgagtoc agcaacacca tccggggtaa tacccaatgt agcagcgaca 

TCCAGCCATT OCAGAGTOAC ATCTATAAGT TCTCCAGTTG GTAAAGGTAT ?CnCTZ:ZCAA 

ACCGGTCTGT TGCAATGCTT GTC^CftCAAC CTGiAGCATCA AAATTTTAAC GCCACC5CCA 

AATTC5TTC&3 CAGTGAACGC TWTAftGTTC CAAATGCT2T TAA-SATTCTQ TC0CG7AGCT 

TCACAACGCA TGATCACAGC ATGGAftGCGG GTCAGCGCTT GCAAAGTGGG GAGATCATGT 

TGCAGTGCTC TSGTTTCTGA TTOGAATTTC TCACCAACAO GGTCAGTTCG 

TTTT03CTGA GTCCAATATT GCGCACAATC AGAGAAAGTT GCCCCAGTAC CTGACAAAAA 

C3CCACCATGT TGCTOGTTTC ATTCTCTGAG CGATC^XCGGT TAGCOGCAAT AATCATGAAA 

TCATCGAATG TCAGTCC7JTG TGGTTTTATC TGATTAATCC ACAGCAAAAT AJSTTTCTGCT 

GrtrrrT^acTG aatccatttg aatgctggca gc^atcagcg gggcaqctcc ad&satcagt 

TCGTCATCAC TOAGTC1AAAG TGTTGATAAT CCATTACTTA GTGTD5TGAT AAGGTTTTCA 
ATATCCGGCO TAAGGACAGT GCTGTAATTA TCCGTCGTC^ TCAGAAAC^C ATWCTCACA 
CRCZJLTrTC*? GTG TTGTCAC CCACTCGGTG CATTGGAACA GAAAGCT^AT TAATTGCGTT 
AATSCTGTAr CAGAAAAAAG OGCAATTTTC GTGTOCACAT AGGGAGAAAC OGACAACAAC 
TTTCTTCGCA
ACCCTTX^CTT 

TATACTCAGG 
TCTTTTCflTG 
GAACTCTATG 
rJA-TCOGGAA 
rAGTGGTTCT 
AACTCACAAA 
TGTGAAAACT 
GaGGTTTTTT 
TmACPG CAG 
TTTTTTATTT 
ACAGWTTAT 
GAAATTATTA 
TATAAAGOM 

cDGGTrrrre 

GCCAC ATAAA 
G TTCA ATTTT 
TCTTTGCCAG 
ACGT5ATGGT 
AATCTG AGTT 
AAT7GGCAAG 
TGAAGCACCA 
AGACCGGGTG 
TCATACTCAG 
CGGCAAAAGT 



ATGTCT0CCA 
TCTGQATTTG 
CTCAAAC?CAC 
TCCAGTtTOG 
ATTGTAAAAT 
ACCASCGCAT 
TTACCCAACA 
ATATCTTCTC 
AGGAATATAT 
ACGGATTTEG 
CAATTTAATG 
TATATATTCT 
TACACXGAAA 
AT5TGAATCG 
CCCGTCATTS 
ACTOAAGGAG 
CAAA<3AAGCA 
TTCTGrTGAA 
ATCAGTCTTA 
GCTAAATCCG 
tA<5ATGATAG 
ATAGTCAGTT 
TGTGGAAATT 
TCCTTTATTC 
TTCTGTTAAA 
GTGGGCGCGG 
AAGGTTAGTG 
TTTATTCTGT 
CAGAGTTCCA 
TAATAAAGTA 
ATATAAATCC 
TAATCTGATA 
AATAAAAAAT 
AAAATTAAAT 
TTAATTGTOT 
TATTC5AGGAT 
AAAGAAATCC 
AGTAATGTCA 
TAATCAAAGA 
TTGGTGCTTT 
GT<?GATTTAA 

AATTCGCGGC 
AGCTATCTXSG 

GTTCGTATCG 
ATCMCATAG 
GCTTTCTO^G 
T3GTCACT3A 
AATAGAGGAA 
A'STOCTTTCA 
TGTCA^ATGT 
ATTTPlTTTAA 

ctgttgtaag 
ttggcagcao 

ATTCAACCAC 
CCAAACCCTG 
TTTG^CATTA 
CTITCTCAAG 
GTAG TGTTTT 
GTCTG 

TTTCTCTTGJi 
GGTCAGGCAA 
AGCGATAGAG 



GCAGACGAAC 
TTCCGCCATT 
GnTCATTAl' 
TATTATCAGC 
&ACTGGGTTG 
OGC7GACGCT 
CATTGCTGTC 
QAGATATGCC 
CACCOOGAAC 
TTAACTCGCC 
fiGAATACTGT 
TTATCTCCAT 
TTRTATTTGT 
TAJOAAATCTT 
ACCAATOTTA 
AGATTOACAA 
TAACTCTtAA 

ttttccggtg 
atagccagta 

GTGAGGTTrr 
GGTCTCATGC 
TTCTCTAACC 
TCTTCATCCA 
ACGTGATATT 
TAAGCCOCTQ 
CCATAAAACC 
ATA-TOAGCCA 
TTTGATTOCJ' 
CTT^rCCTAT 
TCACCAGCCT 
TTAAGTTATT 
6GAATATTAT 
AAAGAACTTC 
TTTAACAAAC 
TTGTGCTGVT 
□TCTTGCTTG 
TTAATAAAGT 
CATAATTTAT 
GAAAGCTATG 
GGCACAGCAG 
AGGTCCAACT 
GGTTGrTTCTC 
CGCATAATAC 
AACCCCTGTT 
ACATTCATGA 
AAGGTGAAGT 
TGAAArAACT 
AGTOGGAAGT 
TTTGGTT5CAT 
TGTCACCCTG 
TATGTTATTA 
TCOTAATATT 
ATAAGTTTTC 
GA AGQ3GT TT 
TO3TTTO3TG 
GGGCAATAGG 
ACCTTCCGGC 
AGATA.TTGAT 
CTACCGACAT 
T tjGTT CTTTC 
ATTTTCGSGr 
gGCAT TGGAT 
TTTTA05CnT 
OGTACTTTCA 



GDSATAAAGC 
AGCCAGTTTC 
TCCCAAATAA 
AGAAAACTCT 
TTGTTTAGrtTG 
AATATTATAG 
AATG OTTOAG 
TCTGGCTTTO 
TCCATCATTt 
ATAAGCGGAG 
AATGGGTATT 
ItTTOg AGACG 
A7TCATTTTC 

tgotggttcg 
tcajsttgctc 

ATAAACTGAG 

AAATCAGTAC 

TAATTKTrrC 

GCGATGTTGO 

TATCTTGCAA 

DSGTTTGOCG 

TCTGAATATT 

T A TT ATX C TG 

GAGAaTTGTC 

GTCAAAATAT 

AACTTGTTOTA 

GTTGTGGATT 

GATGAGCGTC 

GACGAAATTC 

TTTTCTATTT 

GtCAATTTAA 

GGTTAATTAA 

CCTATAATAC 

TTTCATC? AAA 

TGAAAAATGA 

GTTTCAGGGG 

TGQ5TAATTA 

TGAATACCC0 

AAAAAAAGAC 

GGTGGCTTOG 

CCCAACCTGA 

&AAGGAAACA 

GACTCACTAT 

ACGCCTGAAT 

TAAGCGCTCG 

&3ACAAAI3AC 

CAAGCACTTT 

TTAACCGTAG 

AASTTATCAA 

TAGGTGAATA 

AATTGAAAAA 

ATTAA i v 

ATAATTOTGA 

ccatgaagat 
tcacggtttc 

TCGG TAAAAA 
TCTTATGAAT 
TGTTAATTTC 
AAGTGATTTC 
CGGGGGTAAC 
TCAGGTTGAT 
AACGTCTCCC 
GGTGCCGCCC 
GAQAGCGATG 
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J 0421 TATTCAGTAG CTCA3TGATT TTAAG7GTAA TAAGCTCAAG GWCCATGST OAAeOGAGAT 

30481 AGCCAAAAT3 TTGTGCJCGAG TBCTOTAATA AGAAAGAAAT GJtCTCTGJUG AGCOGAGCTm

30541 AGTTCCAOAT GGCAGGCCTT CCCGCCGGGA GGCTmAAC TCCTTCCAAC CCGTATAATfc

30601 TTAACCAATT TACCCAACGA TOAAOMAAG AACGTGAACA GTGAAGCGTT CTCGAAACGT 

30661 GAGAAACOTT ACTCCCTTCA TGTAACATCA AGftGCGOGGT 6AAGCGAC3T OCATAGTCCT

30721 TATCCCGGGT TTTCTGGATA GCTTTTTTCA TCGGACOT03 TTGATTTC50g WJTATTOATG

30781 TTATOATTOG CATGACTCAG TCCATTTTGG GAlTJGTTrr SAmtMOSA TTAATCAGAT

30841 CGDSAAAATC GGACTGADTT CCCTTCAAGT GATCTACTAT TTTGAAATCT TMTTAATCA

30901 GGAGTCAGCA AAOOAGTTAT TCCCCATAAT ACCTGACCAT GTGCTTOTTT ATCCGGS&AA

30961 TGATTCATCT ACOGGTGGTA 'tgTgSATTCC TTOSTOCCAT AGTCAGAAftG ATJLT7GACTC 

31021 T&gcCATTAT ATCAAAGTTA CTTTCAGTAA AAAGGACGCr GCTGATATTG TGAACTACAT

31081 GTTTCAACAT GGCAGTTATG TTTATTTTAC AGACAGTAGT AAACftATTTA GCAftTAAOCA

31141 AArTATGTCT GGTGATTCAG CTAAAGGCAA AGGGGATTAT AAQCTTEAAA TTAAAACAAA

31201 CGGGAACCTT CCACTGATGG TATTGAATAA ATATTGATTC ATTATTAlTT ATCGATAAGA

31261 AATTAA ffTTT ATATTTCATC TG<5TTTCTGC AATTAAGTTr TAAAAATTAA TTCTACTTTT 

31321 TTTATGGTTT TATATTTAAT GCCAATCATA ff^TrrrJCT TATAATAATT GA1AG7TTAT 

31381 TTATATAGTA AATAAATTCT GTTOGATGTG ATTATTAT7G TGAGACGGTA ATAATTAACA

31441 TAACASAAWi TTCATGGTTA GGAAATTCAJL TCd UTlTO TCOSGTTTCC TGACCATGAA 

31501 GACCTGTATT TACTCTAGAA CTCGCATTGA TACTGGATTG ATTAGCCGGA CGAGTGTTCG 

31561 GTCAGCAGAT AATATCTTGX ATATTGGCTG TGGATTTTTC AGCGAjGATGA TAGC7TTGGC

31621 AGTAAAGGTO A7TTAATAACC GATAAAACAG AGAGAOGGAT TGTCGCCAGG AAAGCAAAAA

31681 AG^rCACCA TGACG03TTA TTCAAACATT TTTTAACCCA ACCAGrAAACC CCCCGGGAAT

31741 TlTTATCCCT TTATCTGCCG GAAGDSATCC GGTCAGTGTG TGATTTACCA CACTAAAACT 

31801 GG.-ACvSGCA GCTTTGTGGA CAGGCAATTA C3TCAGTTSC ACAGTOATGT GCTGTATTCT

31861 GTCGAGACAA CCCACGGSGA CGGTTAQATT TATTCCCTGA T7GAACACCA GTCCACGCCT

31921 GATCCGTT&A TGOCCTOGOS GCTGATGTAT T/lTTCGCTGT CACCCATOSC TGCGCA7CTG

31981 AAAAAAGGAC ATACTGAACT CCCTTTGGTC G1CCCCCTC-C t S ' ITIT ATC* TCGTOAGGTG

32041 AGGCCTTACC CTTACTCAAA TTOATSSCTG G^TTGTTTTA CAOTCTCTGA A^CCCGGtT 

32101 CAC^TCTATA ATCWCCCCT GCCGTTGGTG GATATCAGTG CGCTCAGTGA TGAAGAGATC 

32161 CTGACACATA AAAGCATTGC CTTGATGGAC; CTyGTACAAA AACATATOCG TTOCCGGGAT

32221 ATGCTGGA3T GGGTTCCCCA ATTGGTGGCG TTCn^AAT^ COSGTTATAA TAGTOCCCAA

32281 CAGCGCCATG TTGTGTTAAG CTATATTTTA CTG^TCGac: A TAOG CTDG A TCTC^CCCAG

32341 TTT5TCCATC AACTCACTCA AOUTCTCr-3 OA3CATGAAA CCATCTTGAT GAOTATTGCA

32401 GAAGAGCTTC AACAAAAAjGG <3CGTSAGCA/i GCCCGG ACLT^G AAGGCAGAAC AGAAGGCAGA 

32461 GCTGAAGGAt SOSAAGAAGG CAAGCTGGAA ACGGCGCOCG CATTAlTACG GCATO3TGTC

32521 AGTCTGGACA TCATTGTCAC CAGTACCGGC CTGAG CCGM AGAAAATTCA AGCGTTAaAjG 
32581 CATTAAATGG ATACGCTTTT TCACAGCADG AT7.Tr^TGAC CCCTGTGAGG CCACCGGAAfc

32641 ATTTTATTTA CfACGATTTA OGA.CGGGTTA ^mTAGGAAG CTOAATOA3A CGTCCTTTGT

32701 TATATAACW TCCCATATCA ATCTTCTdT TTCCGCGTAC AGGTAAjCTAA OZCAAAZUT? 

32761 CGT3 AG CfcGC ATTTGCCAAC AGGCCATCAT 22TZ7,T€GCC TGACCAAGAO AAGATCCOGC

32821 COJiTTTtCAT TT5V3GTTGCA TAAATTCCCT TZ?G CAGOC AGTCCGGGGC GTATCCAGTG

32881 AAATCOlGTG ACC AgCOTC A OCATTAAAGA GTGrGTCAGi: GTCGGTTTCC GTGTCTGTCA

32941 CCACjTTCAAA CTGATTTTTC CCGCGTGCAA Trrt/trATTC CGCATCGTAT tggttattca

33001 QCAGACAOAA GAATTCCGGA GCACCTTTTT CCATCGTGCIC CAGTGGCTCT CCTCTTCTGT

33061 TATAGCGGCG OGTTGrCAGA rCA«3CACCCA SAC^TGAACG TCCATAGTTA GCAAATCCGA

33121 GGTGAATTTT CTCO&STTGT ACAGCTTGTG ACAGTAAAAA CJC-SGATCGCC TCATCTGCCG 

33181 AGTAATCCAT GTCCOGATCA GGATMC*GOT3 v5AGGAGGCTT ATCGCOGTCA ThTTCJittiTC

33241 TGGGGGGATA CA-3G TTAGTA TGGTGACCGA TGTATTCTGC CCAACCGGTA CGAAAGAAGT

33301 GGTAGGTCAT CACAAAGATA TTGTCTAAAT AA-^ST^«AT TTCTTITGAAG CTCGACTTCT 

33361 CCATTTTCOC AACGACGGCG CTACAGGCTA TCGTGATTTC mAOGGGCC CGGSTTCCAA

33421 AOSC&ATCTT CAC3TCCTTCA CGCAGCTC7T TCACTAACAA AACATAGTTT GGGCCATCAT
33481 GTTCOSGG7C QAATTCATTA CCTTCTTCAC CTGTGGCCCC GGGGTATTCC CACTCGWAT
33541 CCACOTCAGT AAACATGGGA AAAOGCCOGC AAGAAGTCGA CGATGCTACT GAGAAATOTA 
33601 GCACGTTGCT CA3GATCTTT G5CCATCACA GAGAAATACC CTGACATAjCT CCAGCCGCCG 
33661 ATAGTOAATC CGAGTTCCAG CTTATOOCCT GCCTCTTTTG CTCGOSCTTT CAGATTACGC
33721 AATCCCCCCA GTAAACOGGA GGCTGCATCC TGATTGTAAT ATTGCAAGAA ATTCTTCGGG
33781 CTTGGCATCAC GGOGCTGATC DSCGTCCAGA CCGACATTGC GTGTTfiCTCCC TAAATCACCA
33841 TAAGGATCAA CGGGTACAAT ATGGCCTAAT GTAATAGGGG CAAIOTSGCC ACTGCTGGCr
33901 TCTGCTTGCC GGTTCCACCC GTCAACAACC TCATTAATCC GTTCGGATAA CTTGCCTTTG 

33961 TCACCGTTGA DGGCCATAAA ACTCAAAATC AGGCGGTMT AGGCGGTAGG CGGOhTTTTl
34021 TCCAGATCAA AACCACGGCC GGGGGCATCG tCGCTCGTCA GCGCAGTGTT ATCCTGwIT 
34081 TCTGGCOACA AAOGCGCATC ATACTCGCAC CAGTCAGTAA TATAGQCAOA GACMTAC6C
34141 A6CGGTTCTG TATTTTCCGG ATCAACTTCA TATTCGTTGT ACAGSGACTT GGCAACACGT

34201 GCTGAADAAT AACTCAAAGG agttccgctg cogtcaggtt tataxotcac cttctga-tag
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3<261 
[Position index for the sequence columns that follow: 34321 to 38041, in steps of 60]



Fig.2. 

CTTTTCTTCrC 
GGC5TATTG3 
CASGCATTTT 
TCAACCCACC 
CCAGATTGAT 

TATTCCGGGG 
AGA03GCTAC 
TAACOCAG^T 
AAWCCAACT 
TCAAAATTSA 
TTAGCCTCTA 
TAATCGATAG 
GCCOGTACAT 
CGCCGATCAG 
AATGAACCCG 
ATGGTTTTTA 
TGT AAgQC TA 
eCACTTTGGA 

agtaaac*ga 
tcat~cacat 
ccgaagcata 
tccgccacag 
gattgtaatg 
tagtgttgct 

TGT3G CATGT 
TGCTGCATAC 
GTGCTGCCGT 
AAACQATAAT 

tggtaacgct 

AACTCTTTAC 
TGrtACAGTT 
GAGACGGATG 
TCCTGTGCCC 
CAGTTTGCCG 
ATVTGCTTGG 

tctaacagat 
gctgcaat£G 

TCAATAATTG 
TCATCTTCCA 
rCTATATTCA 
ATCCCATTJG 
QTAATAAGCX 
AG TTITTTT G 
GCGGGGAAGG 
TG CAT A CG3 A 
GGCTTAAAGA 
GGATAGGTTA 
TTATCGT3CA 
TTAAAGTCA-5 
ATCAGTTTGA 
TAGGAAAGCG 
TS MTT AAGG 
CCTTTTOSTC 
TCGCAGACAT 
TSAAGACGGA 
ATCACAGCAA 
TACTCCAGCT 
AGSAACCCAA 
ATCTTTAGTJ* 
GAAAAT&2GG 
CTC5CTTCAG 
TGAACGCCAG 
GCGTTTttATG 



TCAGTGCATC 
GGTXftCCGTG 
CATAAACCW5 

CGATorrm 

TCTGCCAGGC 
AAGGGTATO3 
CCOGCTCCTC 
GAACCTGCTG 
G^AGGATAA? 
TCCACTOTSA 
GCCT^AACAT 
CGTTGA3CAA 
TTTlTAAGTC 
CTTCATAAGG 
CGTGGSTOK 
TACTCTTGCC 
TCATCTCCAG 
ATTTCAGATG 
COAAA*MCAT 
GGSAATTAAC 
GTATAAOGCA 
CC5CCAAGAC 
CGSCDGGAGT 
CGATAACTTC 
C5GCGCTCCTC 
CAATOGCTTO 
G^AAij 1 "Gr"l"T C 
ACTGAATCAG 
TCGGGATTGC 
GCCTOAGCAG 
•GATCCAAGOT 
TGAGATATTC 
TATCATATGT 
GTCC7TCACT 
CGAGCAGAA7 
OGACAffTACG 
TCTTGACATA 
CATCCGGATC 
GCCGTGAATT 
GCGGATTAAA 
GTCGCCACGC 
OTCCATTTAA 
CTCTATTCCA 
CAAACAOTCT 
AAACKSAATTG 
TTCTGAACTC 
GAAGTTCGGC 
AATCAAGAAC 
TCAGCG3GAA 
ATTOAOGGAC 
TACT ETC TTT 
TTCTTCCCGG 
CATCCTTGTa 

catcagcata 
aaccgaagoc 
caaacggaac 
ccatctgggc 
gcc*tcatat 
tcacccqcac 
tcagactgtt 

COTA^CAAAT 
GCATTGGGAG 
taatattggt 
ttaacaccat 
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ATATTGCAAT 
ATCGGCAATT 
TAAATCAGGT 
AAAAACCGCG 
AACCWAGAT 
ATAAACATTT 
AXATCAGTTA 
GCTQAACTCA 
ATCGCTCAGC 
AATOGCOCCT 
SCGS CTOT CT 
AACS'1T1 B VC5 
AGCAGrACTG 
CCCCAGC*AT 
ATAATCGCGC 
AATTTCCCAC 
AOGTtTGGTG 

TirrccGACC 

ATCATACTGA 
CGCATCATAG 
TCATTOGTAW 
CATCCCCCCC 
GGCWCTGAC 
CTGCTQSGTG 
AGCGGCCCGT 
CTGTTGCAGrt 
AAAATOTGTA 
CGTTTCTGCG 

ATCTTOCOGG 
CAGGTTATGA 
CGTATCAACA 
CACAGGCAGA 
GTTTTCATAC 
ATCAG3GCTG 
1SCACTGGAT 
GAAACGGAAT 
GCTT5G7TCA 
CCAGTAA CGC 
AATATAGTGC 
GACCAGAAAT 
ATCAATCGGC 
GCTCAGTACC 
ATTAASGCGA 
CACCTGATCC 
TTGA5TACAG 
TjTCA«5SGTG 
TTTTTCGCTC 
AGAACGGTTC 
AATCTCCAGT 
GCCOAAATCA 
aTTOCCGGAT 
TTCAWTCAG 
TTGCTCATCC 
TTCCTCATAA 
CA^AGCCAGA 
CftTCOGCTAT 
7TOGCATAAG 
7AGGCTCAAC 
CtTCAACTTC 
TGGGGTCTAC 
ArAACGCATC 
ATGGCT5THT 






ACCTCGGTTT 
TCTTCCGOTC 
GAAATATTGC 
C TAT CAT AAA 
GCGCCTACTT 
TQAGACATAA 
GAA TTaT CTT 

OACTCCAGCA 
GTCOCTTCAA 
TCCATC5C05 
GGTPTOGTCT 
TAAAGCGTAT 
GOSGGCAATG 
AAGAACATTT 
TGTGATSATC5 
TTATGTTGCA 
AGCAGCCCCT 
CCTCTTTCGT 
CCTTGCAGGT 
AMTTGTTCM15 
ACGGCCAGAC 
TGGGCAGCGA 
ATGGAGATCT 
DGDCTGATCG 
TTGCGGGTAA 
TTGTCTTTTT 
GCCTCTTTTG 
ATGCCCGCCA 
CTGATGGGTT 
CGTAAGTTAT 
G^fTTTCACAA 
AGTGGCACGT 
AGAGCCACAT 
GTACCCAGTA 
GTCAGCTTAC 
ATTQCTTTCC 
ATTAACATCC 
GGTTTACCTT 
ACCCA7TCGG 
GGCATATGGA 
GTAGGGAATG 
TGCGGGATAC 
ATGTTTTGTC 
T5TTCATTGA 
CTGGCACTrT 
ATTC2ATTAT 
AGTACCAGTC 
TAATAlTGAT 
(3TCTCATCAG 
GGGTTCATTC 
AGAGCACCAT 
TAATACCAGA 
GGCAAATCA5 
TCATAATCCT 
AACGGGTT&T 
TGCAGATGTC 
C 6ATTT TGAT 
GTTTTCGTTA 
CGTCCAGGCA 
GCTCCCCAAT 
GGTATCAGGA 
ACGG^TACCr 
TCOGATAACT 



1TTCTCCCS0 
TC3GCCTCAC0 
GCTCWOAAT 
TGACATACCA 
CGCT G CTO SC 
TTTGACTTCC 
GTTTTAATTS 
CACTCACATC 
GCTOATCCre 
AAGOCAGGAA 
TTGAAATCAC 
ATTCCAACGG 
TOCTOAC5TTC 
ACAGCGCTAC 
OT3CGCTCAG 
TCAGTAATSA 
AATACGCCTG 
OATAAAGATC 
ACTGCCAOGA 
AAAC5CCGGAC 

CGAAAATATT 
TfCAClACCTTC 
TTTCATCATA 
TCAi^TOCATC 
AGCTGTACAJCJ 
TCTCCAGCAA 
CCCGGCTCAT 
TACGATTAGC 
CATCGTATAA 
ATAGACXJCTG 
ATAAATCAGA 
TGCT&ACAGT 
CTTGCAGCGT 
ACATATTCAC 
OGTATTCCAT 
OTTAGTGAAT 
GGTACACGOT 
OTTTGCTOGC 
TCGCCTCTTT 
AAAACAGTTC 
AAOCC&STAT 
CCTDGACTGGC 
G0GO3TTA7C 
GTWAATCAG 
CATTC5CCAAC 
COGACCCCAJG? 
GT TCTT CATC 
GATCTTT TAT 
TGCCATGAAC 
CGSTTTOSAT 
AGTAOTGTAA 
CCA&5TTC5CC 
TAATTTCTAC 
TACCTTTCTT 

TTCGOGCAGk 
CCGSTCAOGA 
TGCAGTGATA 
ATATACAGGC 
OCCAGGTCAA 
TGGCGTATCG 
CCGACAAAGA 
TrCAGGTTAC 



OGGTACATCA 
GACA1ATTGC 
ATC5CCAGCCT 
GOTTTCACCA 
•3TCADACATC 
G gCCOC GTTA 
ATGTTTATTC 
AC&CGOM-TA 
ATCOSAACCG 
AAQTTCATCA 
CAtACCTTOA 
OrTAAGCAAA 
TACCAGTGAA 
GGTTTTTATA 
TAAGAAA-3TG 
TTTTACCC5AT 
ATCCATCCGT 
ATTCCAGAGA 
GCCTTCGGCC 
A7TTGGC7GA 
NNMN»MMMMC 
GGOAACCATA 
AGCCGCTCTT 
GAGCGATTTA 
CAATGAAGCC 
TCCCAGTTGC 
ACTCAGTAAC 
GATCGGOGTG 
CACAACACGC 
TCCGGCCG5A 
ATCCAACXIT 
OGGTGC&3CA 
AAGCATTAAC 
ACGGOGTTGC 
GGAGTCATAG 
QT CTC CCTGA 
GGC5TTCACTG 
GGGTGGAGGA 
CnSAACAAGT 
TAATOSTTGT 
CCAGAAATAG 
AGGCTGTTCG 
AATCGCgATC 
AjGTTTCArCT 
TTC5GCGAATA 
ACCACCTTTG 
CTTGATTGAT 
CAAjGACAGTA 
CGCACCAAAC 
AAAATTOACA 
TCTCCGGGAA 
TD3ATA CGAT 
GA-LATATTTT 
CAGCAGTOTA 
ATCTGTCCCC 
CT3TATATCC 
ATGGTQGGTG 
AOGGTCUGGG 

ACGCAGTTGT 
GATTATTCAG 
TAGGTTTCCA 
AAAGATTCAG 
AGAACTTATC 
TGACATCTTC 
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3SXD1 AAAATTATTC ASJttAACCGA GCACCQCTTG T TCTAOtSAA TCTTCGCThA ™TCCCTS 
38161 ATTAATCGCA CTTTCCAGIT: GGAAGfiAGAA TTCWITl'lA. TTCAGG^GTA ACftOSGUTTC
38221 CAGATAGCTT TCtGGATAAG TCOSTAATAA OtMATCCC



N=unspecified base 
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200 bp. 
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J_L 



[Restriction map; sites labelled, in order of extraction: BamHI, HindIII(?), SacI, BamHI, BamHI, HincII(?), SalI]
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